TRANSACTIONS 


OF THE 


AMERICAN MATHEMATICAL SOCIETY 


EDITED BY 
WILLIAM C. GRAUSTEIN 
EINAR HILLE 


C. C. MAC DUFFEE 


WITH THE COOPERATION OF


A. A. ALBERT JESSE DOUGLAS T. H. HILDEBRANDT 

E. P. LANE R. E. LANGER SAUNDERS MACLANE 

MARSTON MORSE OYSTEIN ORE H. L. RIETZ 

H. P. ROBERTSON M. H. STONE J. L. SYNGE 

GABOR SZEGO G. T. WHYBURN OSCAR ZARISKI 
VOLUME 48 


JULY TO DECEMBER, 1940 


PUBLISHED BY THE SOCIETY 
MENASHA, WIS., AND NEW YORK 
1940 




Composed, Printed and Bound by 
The Collegiate Press
George Banta Publishing Company 
Menasha, Wisconsin 




TABLE OF CONTENTS 


VOLUME 48, JULY TO DECEMBER, 1940 


ADAMS, C. R., and MORSE, A. P. Continuous additive functionals on the space (BV) and certain subspaces.
AGNEW, R. P. On kernels of faltung transformations.
BOAS, R. P. Expansions of analytic functions.
BURINGTON, R. S. On circavariant matrices and circa-equivalent networks.
COOLIDGE, J. L. Analytic systems of central conics in space.
DOUGLAS, J. A new special form of the linear element of a surface.
FELLER, W. On the integro-differential equations of purely discontinuous Markoff processes.
GARABEDIAN, H. L., and WALL, H. S. Hausdorff methods of summation and continued fractions.
GLEYZAL, A. Order types and structure of orders.
HALL, D. W., and WHYBURN, G. T. Arc- and tree-preserving transformations.
HALL, M. The position of the radical in an algebra.
JACKSON, D. Orthogonal polynomials with auxiliary conditions.
KASNER, E. Conformality in connection with functions of two complex variables.
MARTIN, W. T. On a minimum problem in the theory of analytic functions of several variables.
MONTGOMERY, D., and ZIPPIN, L. Topological group foundations of rigid space geometry.
MORSE, A. P., and ADAMS, C. R. Continuous additive functionals on the space (BV) and certain subspaces.
NIVEN, I. Integers of quadratic fields as sums of squares.
PHILLIPS, R. S. On linear transformations.
POST, E. L. Polyadic groups.
RITT, J. F. On a type of algebraic differential manifold.
SPENCER, D. C. On finitely mean valent functions. II.
SZÁSZ, O. On strong summability of Fourier series.
TORNHEIM, L. Integral sets of quaternion algebras over a function field.



WALL, H. S. Continued fractions and totally monotone sequences.
WALL, H. S., and GARABEDIAN, H. L. Hausdorff methods of summation and continued fractions.
WEYL, H. Theory of reduction for arithmetical equivalence.
WHYBURN, G. T., and HALL, D. W. Arc- and tree-preserving transformations.
ZIPPIN, L., and MONTGOMERY, D. Topological group foundations of rigid space geometry.




ON KERNELS OF FALTUNG TRANSFORMATIONS 


BY 
RALPH PALMER AGNEW 


1. Introduction. A complex-valued function $J(t)$ defined over $-\infty < t < \infty$
being given, the function

(1.1)    $y(s) = \int_{-\infty}^{\infty} J(t)\,x(s+t)\,dt$

is, if it exists, called the faltung of the kernel $J(t)$ and the function $x(t)$. We
use Lebesgue measure and integration, and let $L$ denote the class of complex-
valued functions $x(t)$ integrable (and hence also absolutely integrable) over
the infinite interval $-\infty < t < \infty$.

It is well known that if $J \in L$, then the faltung $y(s)$ of each $x \in L$ exists
(that is, is finite) for almost all $s$, and $y \in L$. This is implied by the computation

(1.2)    $\int_{-\infty}^{\infty} |y(s)|\,ds \le \int_{-\infty}^{\infty} ds \int_{-\infty}^{\infty} |J(t)|\,|x(s+t)|\,dt = \int_{-\infty}^{\infty} |J(t)|\,dt \int_{-\infty}^{\infty} |x(s+t)|\,ds = \int_{-\infty}^{\infty} |J(t)|\,dt \int_{-\infty}^{\infty} |x(t)|\,dt$,

which is justified by the absolute convergence of the integrals involved. If
$J(t)$ is an essentially bounded measurable function, say $|J(t)| \le M$ for almost
all $t$, and $x \in L$, then the simple estimate

(1.3)    $|y(s)| \le \int_{-\infty}^{\infty} |J(t)|\,|x(s+t)|\,dt \le M \int_{-\infty}^{\infty} |x(s+t)|\,dt = M \int_{-\infty}^{\infty} |x(t)|\,dt$

shows that $y(s)$ exists and is bounded over $-\infty < s < \infty$. Each of these results
is of the type: If $J$ has property P, then $y$ has property Q for each $x \in L$. To
supplement such results, it is desirable to know whether the conclusion that $J$
has property P can be drawn from the hypothesis that $y$ has property Q
whenever $x$ belongs to an appropriate class $X$ of functions. Doubtless the most
pertinent questions are those for which the class $X$ is $L$ itself. We are able to
obtain affirmative theorems not only when $X$ is the class $L$ but also when $X$
is a suitable class of step functions in $L$. Such theorems become stronger and


Presented to the Society, September 7, 1939; received by the editors July 18, 1939. 
throw more light on the real character of faltung transformations when the 
extent of the class X is reduced. There is some arbitrariness in choice of the 
classes X ; we endeavor to make them at the same time as simple and illumi- 
nating as possible. 
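
The two estimates (1.2) and (1.3) of the introduction have exact discrete analogues, which the following minimal sketch checks numerically. It is an illustration only: the grid spacing `h`, the sample kernel, the sample step function, and the helper names `faltung` and `l1_norm` are all assumptions introduced here, not part of the paper.

```python
import cmath

def faltung(J, x, h):
    """Riemann-sum analogue of y(s) = integral of J(t) x(s+t) dt.

    J[j] samples the kernel at t = j*h and x[m] samples x at t = m*h;
    both are treated as 0 off their grids. Returns the values y(k*h)
    for k = -(len(J)-1), ..., len(x)-1, in that order.
    """
    y = []
    for k in range(-(len(J) - 1), len(x)):
        total = 0.0
        for j, Jj in enumerate(J):
            m = k + j  # index of the sample x(s + t) with s = k*h, t = j*h
            if 0 <= m < len(x):
                total += Jj * x[m]
        y.append(total * h)
    return y

def l1_norm(f, h):
    """Riemann-sum analogue of the integral of |f|."""
    return sum(abs(v) for v in f) * h

# Hypothetical data: a bounded, integrable complex kernel and a two-step function.
h = 0.01
J = [cmath.exp(1j * (j * h) ** 2) / (1.0 + (j * h) ** 2) for j in range(300)]
x = [1.0 if 50 <= m < 120 else 0.25 if 120 <= m < 200 else 0.0 for m in range(250)]
y = faltung(J, x, h)

lhs = l1_norm(y, h)                    # analogue of the left member of (1.2)
rhs = l1_norm(J, h) * l1_norm(x, h)    # analogue of the right member of (1.2)
sup_y = max(abs(v) for v in y)         # analogue of sup |y(s)| in (1.3)
M = max(abs(v) for v in J)             # bound on the sampled kernel
```

In this discrete setting `lhs <= rhs` and `sup_y <= M * l1_norm(x, h)` hold by the same interchange of summation and triangle inequality that justify (1.2) and (1.3); the sketch says nothing about the converse questions the paper studies.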

An example may serve to illustrate a role played by step functions in
the theory of faltung transformations. If $J(t) = \exp(it^n)$, $n > 2$, then (see §6)
simple estimates show that the faltung of each ordinary step function is a
bounded continuous function in class $L$. But Theorem 3.1 shows that there
exist generalized step functions in class $L$ of which the faltung is not in class $L$.

The main results of this paper are Theorems 2.1, 3.1, and 4.1 which are
of the following type: If $y(s)$ has property P for each $x(t)$ belonging to a
class $X$ of functions, then $J(t)$ must have property Q. With each of these
theorems is associated a theorem of familiar type which asserts that, if $J$ has
property Q, then (i) $y$ has property P for each $x \in L$ and (ii) a certain constant
determined by $J$ is the bound of the transformation, that is, the least constant
$M$ such that a constant (norm) determined by $y$ is less than or equal to
$M \int_{-\infty}^{\infty} |x(t)|\,dt$ for each $x \in L$.

The class $X$ is in each case a nonlinear subclass of $L$ consisting of certain
generalized non-negative step functions. Neither the class $X$, nor the larger
manifold $\mathfrak{M}(X)$ consisting of all finite linear combinations of elements of $X$,
forms a closed set in the space $L$ in which the distance between two elements
$x_1(t)$ and $x_2(t)$ of $L$ is given by the familiar metric

(1.4)    $\int_{-\infty}^{\infty} |x_1(t) - x_2(t)|\,dt$;

in other words the space obtained by using the elements of $\mathfrak{M}(X)$ and the
metric of $L$ is not complete. It is shown in §6 that each of Theorems 2.1, 3.1,
and 4.1 will fail if $X$ is replaced by certain smaller classes of step functions.

Let $S$ denote the special class of all real non-negative functions $x = x(t)$ such
that (i) $x \in L$ and (ii) there exist non-negative constants $\dots, c_{-1}, c_0, c_1, c_2, \dots$
and real constants $\dots < a_{-1} < a_0 < a_1 < a_2 < \dots$ (depending on the particular function $x$)
such that $\lim_{n\to-\infty} a_n = -\infty$, $\lim_{n\to\infty} a_n = \infty$, and for each $n = \dots, -1, 0, 1, \dots$,

(1.5)    $x(t) = c_n$,    $a_n \le t < a_{n+1}$.

Each $x \in S$ may be described as a real non-negative function in $L$ which is a
generalized step function(1) having a finite number of steps in each finite interval.

(1) We reserve the term ordinary step function for step functions which vanish outside some
finite interval.

Let $S_U$ denote the subclass of $S$ consisting of those functions in $S$ for which
$a_n = n$, $n = 0, \pm 1, \pm 2, \dots$; each $x \in S_U$ is a unit step function, each step
being one unit long. Each $x \in S$ is bounded over each finite interval, and each
$x \in S_U$ is bounded over $-\infty < t < \infty$. (It is a trivial remark that the last assertion
would be false if in (1.5) $a_n \le t < a_{n+1}$ were replaced by $a_n < t < a_{n+1}$.)

2. Conditions for existence of $y(s)$. This section is devoted to discussion
and proof of the following two theorems.


THEOREM 2.1. If $J(t)$ is such that, for each $x \in S_U$,

(2.11)    $y(s) = \int_{-\infty}^{\infty} J(t)\,x(s+t)\,dt$

exists for at least one $s$ in the interval $-\infty < s < \infty$, then $J(t)$ is measurable(2)
over $-\infty < t < \infty$ and for each constant $0 < A < \infty$ there is a constant $M_A$ such
that

(2.12)    $\operatorname*{l.u.b.}_{-\infty<u<\infty} \int_u^{u+A} |J(t)|\,dt = M_A < \infty$.


THEOREM 2.2. If $J(t)$ is measurable and (2.12) holds, then for each $x \in L$,
$y(s)$ defined by (2.11) exists for almost all $s$ and is measurable, and for each $A > 0$

(2.21)    $\operatorname*{l.u.b.}_{-\infty<u<\infty} \int_u^{u+A} |y(s)|\,ds \le M_A \int_{-\infty}^{\infty} |x(t)|\,dt$

where $M_A$ is the constant of (2.12). Moreover the constant $M_A$ in (2.21) is the
best possible one in the sense that if a measurable function $J(t)$ satisfying (2.12)
and $A > 0$ are fixed, then, for each $C < M_A$,

(2.22)    $\operatorname*{l.u.b.}_{-\infty<u<\infty} \int_u^{u+A} |y(s)|\,ds > C \int_{-\infty}^{\infty} |x(t)|\,dt$

will be true for some $x \in L$.


If $A_1$ and $A_2$ are finite positive numbers, then each interval $u \le t \le u + A_1$
can be covered by a finite set of intervals of the form $u_k \le t \le u_k + A_2$; hence it
is apparent that if the left member of (2.12) is finite for some one $A > 0$, then
it is finite for each $A > 0$. Therefore the condition (2.12) is equivalent to

(2.23)    $\operatorname*{l.u.b.}_{-\infty<u<\infty} \int_u^{u+1} |J(t)|\,dt < \infty$

and this condition is easily seen to be equivalent to

(2.24)    $\operatorname*{l.u.b.}_{n=0,\pm 1,\pm 2,\dots} \int_n^{n+1} |J(t)|\,dt < \infty$.
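
The covering argument can be made quantitative. The following is a sketch only: $M_A$ denotes the left member of (2.12), while the integer $N = \lceil A_1/A_2 \rceil$ and the explicit constants are introduced here for illustration and do not appear in the paper.

```latex
% Cover [u, u+A_1] by N = \lceil A_1/A_2 \rceil abutting intervals of length A_2:
\int_u^{u+A_1} |J(t)|\,dt
  \le \sum_{k=0}^{N-1} \int_{u+kA_2}^{u+(k+1)A_2} |J(t)|\,dt
  \le N\,M_{A_2},
\qquad\text{so}\qquad
M_{A_1} \le \lceil A_1/A_2 \rceil\, M_{A_2}.

% Unit windows versus integer windows: with n = \lfloor u \rfloor,
% [u, u+1] \subset [n, n+2], hence
\operatorname*{l.u.b.}_{n} \int_n^{n+1} |J(t)|\,dt
  \le \operatorname*{l.u.b.}_{u} \int_u^{u+1} |J(t)|\,dt
  \le 2 \operatorname*{l.u.b.}_{n} \int_n^{n+1} |J(t)|\,dt .
```

The first line shows that finiteness of the bound for one window length gives finiteness for every other; the second shows that it suffices to test windows beginning at integers.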


(2) Perhaps little would be lost if we were to assume measurability of $J(t)$; but the proof of
measurability of $J(t)$ is so simple (see the few lines following the statement of Lemma 2.3) that
we elect to prove it rather than to assume it.




In §7 we discuss further the class of functions satisfying the inequality (2.12). 

It is a corollary of Theorems 2.1 and 2.2 that if $J$ is such that $y(s)$ exists
for at least one $s$ whenever $x \in L$, then $y(s)$ must exist for almost all $s$ whenever
$x \in L$. This does not imply that if $J(t)$ and $x(t)$ are a pair of functions with
$x \in L$ for which $y(s)$ exists for at least one $s$, then $y(s)$ must exist for almost
all $s$; indeed if $J(t)$ is 0 or $t^2$ according as $[t]$, the greatest integer less than
or equal to $t$, is even or odd, and $x(t)$ is $1/(1+t^2)$ or 0 according as $[t]$ is
even or odd, then $x \in L$ and $y(s) = 0$ when $s$ is an integer but $y(s) = \infty$ when $s$
is not an integer.

Our first step in the proof of Theorem 2.1 is to prove 


LEMMA 2.3. If $J(t)$ is such that, for each $x \in S_U$, $y(s)$ exists for at least one $s$,
then $J(t)$ is integrable over each finite interval $a \le t \le b$ and, for each $x \in S_U$, $y(s)$
exists for at least one $s$ in each closed interval of unit length in the interval
$-\infty < s < \infty$.

To prove Lemma 2.3, let $J(t)$ satisfy its hypothesis and let $x_0(t)$ be a
positive function $x \in S_U$, say $x_0(t) = 1/(1 + [t]^2)$. Let $s_0$ be fixed such that

(2.31)    $y_0(s_0) = \int_{-\infty}^{\infty} J(t)\,x_0(s_0 + t)\,dt$

exists. Then if $-\infty < a < b < \infty$,

    $\int_a^b J(t)\,x_0(s_0 + t)\,dt$

exists. But the function $1/x_0(s_0+t)$ is measurable and bounded over $a \le t \le b$;
therefore $J(t) = [J(t)\,x_0(s_0+t)]/x_0(s_0+t)$ is integrable as well as measurable
over $a \le t \le b$.

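Now let an arbitrary function $x \in S_U$ be fixed. The function $X(t)$ defined
by the series

(2.32)    $X(t) = \sum_{n=-\infty}^{\infty} 2^{-|n|}\,x(t+n)$

exists for almost all $t$, and $X \in S_U$; that $X \in L$ is shown by the computation

    $\int_{-\infty}^{\infty} X(t)\,dt = \int_{-\infty}^{\infty} \sum_{n=-\infty}^{\infty} 2^{-|n|}\,x(t+n)\,dt = \sum_{n=-\infty}^{\infty} 2^{-|n|} \int_{-\infty}^{\infty} x(t+n)\,dt = 3 \int_{-\infty}^{\infty} x(t)\,dt$

which is justified by the fact that $x(t) \ge 0$ and $x \in L$; and $X(t)$ and $x(t)$ are
constant over the same unit intervals. Let $s_0$ be fixed such that

(2.33)    $Y(s_0) = \int_{-\infty}^{\infty} J(t)\,X(s_0 + t)\,dt$

exists. Then

(2.34)    $\sum_{n=-\infty}^{\infty} 2^{-|n|} \int_{-\infty}^{\infty} |J(t)|\,x(s_0 + n + t)\,dt$

exists, and since each term in the sum is measurable and non-negative, this
implies that

(2.35)    $\int_{-\infty}^{\infty} |J(t)|\,x(s_0 + n + t)\,dt$

exists for each $n$. Thus

(2.36)    $y(s) = \int_{-\infty}^{\infty} J(t)\,x(s + t)\,dt$

exists when $s = s_0, s_0 \pm 1, s_0 \pm 2, \dots$. Since each closed unit interval contains
at least one of these points, Lemma 2.3 is proved.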
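
The factor 3 in the computation showing $X \in L$ can be checked directly. The sketch below assumes that the series (2.32) carries the weights $2^{-|n|}$, an assumption that is consistent with the factor 3 but is a reconstruction rather than a quotation.

```latex
\sum_{n=-\infty}^{\infty} 2^{-|n|}
  = 1 + 2\sum_{n=1}^{\infty} 2^{-n}
  = 1 + 2\cdot 1
  = 3,
\qquad\text{hence}\qquad
\int_{-\infty}^{\infty} X(t)\,dt
  = \sum_{n=-\infty}^{\infty} 2^{-|n|} \int_{-\infty}^{\infty} x(t+n)\,dt
  = 3 \int_{-\infty}^{\infty} x(t)\,dt .
```

Any summable sequence of positive weights would serve the same purpose: $X$ dominates a positive multiple of every integer translate of $x$, so existence of $Y(s_0)$ forces existence of $y(s_0 + n)$ for every integer $n$.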

To complete the proof of Theorem 2.1, let $J(t)$ satisfy the hypothesis of
Theorem 2.1 and hence the conclusion of Lemma 2.3. To establish (2.12), we
assume that (2.12) fails and obtain a contradiction. Failure of (2.12) implies
that the left member of (2.24) is $+\infty$; hence there is a sequence $n_1, n_2, n_3, \dots$
of integers such that $|n_p - n_q| > 3$, $p \ne q$, and

(2.37)    $\lim_{\alpha\to\infty} I(n_\alpha) = \infty$

where

    $I(n) = \int_n^{n+1} |J(t)|\,dt$.

It follows from (2.37) that we can choose a decreasing sequence $\theta_1 > \theta_2 > \dots$
of positive numbers such that

(2.38)    $\sum_{\alpha=1}^{\infty} \theta_\alpha I(n_\alpha) = \infty$,    $\sum_{\alpha=1}^{\infty} \theta_\alpha < \infty$.

Let

(2.39)    $x(t) = \theta_\alpha$,    $n_\alpha - 1 \le t < n_\alpha + 2$,    $\alpha = 1, 2, \dots$;
          $x(t) = 0$,    otherwise.

Then $x(t)$ is real, non-negative, and constant over each of the abutting unit
intervals $n \le t < n+1$; and the second of the relations (2.38) implies that $x \in L$.




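Hence $x \in S_U$. Since the integrands are all measurable and non-negative, we
find when $|s| < 1$

    $\int_{-\infty}^{\infty} |J(t-s)|\,|x(t)|\,dt = \sum_{\alpha=1}^{\infty} \int_{n_\alpha-1}^{n_\alpha+2} |J(t-s)|\,|x(t)|\,dt$

        $= \sum_{\alpha=1}^{\infty} \theta_\alpha \int_{n_\alpha-1}^{n_\alpha+2} |J(t-s)|\,dt$

        $\ge \sum_{\alpha=1}^{\infty} \theta_\alpha \int_{n_\alpha}^{n_\alpha+1} |J(t)|\,dt = \sum_{\alpha=1}^{\infty} \theta_\alpha I(n_\alpha) = \infty$;

hence

    $y(s) = \int_{-\infty}^{\infty} J(t)\,x(s+t)\,dt$

fails to exist when $|s| < 1$ and we have a contradiction of the fact that $y(s)$
must exist for at least one $s$ in each unit interval. This completes the proof of
Theorem 2.1.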

Proof of the first part of Theorem 2.2 is very simple. Assuming that $J(t)$
is measurable and (2.12) holds, and that $x \in L$ and $A > 0$ are fixed, we find for
each real $u$

    $\int_u^{u+A} |y(s)|\,ds \le \int_u^{u+A} ds \int_{-\infty}^{\infty} |J(t)|\,|x(s+t)|\,dt = \int_{-\infty}^{\infty} |x(t)|\,dt \int_u^{u+A} |J(t-s)|\,ds \le M_A \int_{-\infty}^{\infty} |x(t)|\,dt$;

the steps are easily justified by fundamental theorems which imply also that
$y(s)$ exists almost everywhere and is measurable over $u \le s \le u+A$. It follows
immediately that $y(s)$ exists almost everywhere and is measurable over
$-\infty < s < \infty$; and that (2.21) holds.

In our proof of the last part of Theorem 2.2 we shall use the following 
lemma in which we choose notation to fit the application. 


LEMMA 2.4. If $u$ is real, $A > 0$, $h > 0$, and $J(t)$ is integrable over the interval
$u \le t \le u + A + h$, then

(2.41)    $\lim_{\delta\to 0} \int_u^{u+A} ds\, \frac{1}{\delta} \int_s^{s+\delta} |J(t) - J(s)|\,dt = 0$.


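In case $J(t)$ is continuous over $u \le t \le u+A+h$ we can, for each $\epsilon > 0$,
choose $\delta_0 > 0$ such that $|J(t) - J(s)| < \epsilon/A$ when $s$ and $t$ lie between $u$ and
$u+A+h$, and $|t-s| < \delta_0$; letting $I(\delta)$ denote the iterated integral in (2.41),
we find that $0 \le I(\delta) < \epsilon$ when $0 < \delta < \delta_0$ and (2.41) follows. In case $J(t)$ is not
continuous, we can show that $\limsup_{\delta\to 0} I(\delta) \le \epsilon$ by use of the following inequality:

    $|J(t) - J(s)| \le |J(t) - J_\epsilon(t)| + |J_\epsilon(t) - J_\epsilon(s)| + |J_\epsilon(s) - J(s)|$

in which $J_\epsilon(t)$ is a function continuous over $u \le t \le u+A+h$ for which
$\int_u^{u+A+h} |J(t) - J_\epsilon(t)|\,dt$ is sufficiently small.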


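Let a measurable function $J(t)$ for which (2.12) holds and a constant $A > 0$
be fixed. For each $\delta > 0$, let $x_\delta(t)$ be defined by

(2.42)    $x_\delta(t) = \delta^{-1}$,    $0 \le t < \delta$;
          $x_\delta(t) = 0$,    otherwise.

Then

(2.43)    $\int_{-\infty}^{\infty} |x_\delta(t)|\,dt = 1$,    $\delta > 0$.

The faltung $y_\delta(s)$ of $J$ and $x_\delta$ is

    $y_\delta(s) = \int_{-\infty}^{\infty} J(t)\,x_\delta(s+t)\,dt = \frac{1}{\delta} \int_{-s}^{-s+\delta} J(t)\,dt$;

hence

    $|y_\delta(-s) - J(s)| \le \frac{1}{\delta} \int_s^{s+\delta} |J(t) - J(s)|\,dt$

and for each $u$

    $\int_u^{u+A} |y_\delta(-s) - J(s)|\,ds \le \int_u^{u+A} ds\, \frac{1}{\delta} \int_s^{s+\delta} |J(t) - J(s)|\,dt$.

Using Lemma 2.4, we obtain

    $\lim_{\delta\to 0} \int_u^{u+A} |y_\delta(-s) - J(s)|\,ds = 0$,

and this implies that

(2.44)    $\lim_{\delta\to 0} \int_u^{u+A} |y_\delta(-s)|\,ds = \int_u^{u+A} |J(s)|\,ds$.

If now $C < M_A$, then we can choose a fixed $u$ such that

(2.45)    $\int_u^{u+A} |J(s)|\,ds > C$,

and then because of (2.44) and (2.45) we can choose a fixed $\delta > 0$ such that

(2.46)    $\int_u^{u+A} |y_\delta(-s)|\,ds > C$.

Using (2.42), we can write (2.46) in the form

(2.47)    $\int_u^{u+A} |y_\delta(-s)|\,ds > C \int_{-\infty}^{\infty} |x_\delta(t)|\,dt$;

this implies (2.22) and Theorem 2.2 is proved.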
The hypotheses of Theorems 2.1 and 2.2 do not imply that, if $x \in L$, then
$y(s)$ must exist for all real $s$. This follows from


THEOREM 2.5. In order that $J(t)$ may be such that

    $y(s) = \int_{-\infty}^{\infty} J(t)\,x(s+t)\,dt$

exists for all real $s$ whenever $x \in L$, it is necessary and sufficient that $J(t)$ be
measurable and essentially bounded.


A function $J(t)$ is called essentially bounded if there is a constant $M$ such
that $|J(t)| \le M$ for almost all $t$. Sufficiency is a consequence of the well known
fact that if $J(t)$ is measurable and essentially bounded and $\xi(t) \in L$, then
$J(t)\xi(t) \in L$; and necessity is a consequence of the well known fact that if
$J(t)\xi(t) \in L$ for each $\xi \in L$, then $J(t)$ is measurable and essentially bounded.
If $J(t)$ is essentially bounded, then (2.12) holds and $M_A \le A\beta$ where $\beta$ is the
least constant such that $|J(t)| \le \beta$ for almost all $t$; but (2.12) does not imply
that $J(t)$ is essentially bounded.

3. Conditions for $y \in L$. It is possible to prove, by means of an extension of
a theorem of Banach(3) and some ideas which we use in the course of proof
of Theorem 3.1, that if $J(t)$ is such that, for each $x \in L$, $y(s)$ exists for almost
all $s$ and $y \in L$, then $J \in L$. Theorem 3.1 below, of which we give a direct proof,
includes this result.

(3) Banach, Théorie des Opérations Linéaires, Warsaw, 1932, p. 87, Theorem 9. The extension
required is from the finite interval $0 \le t \le 1$ to the infinite interval $-\infty < t < \infty$, and from
real-valued functions to complex-valued functions.


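THEOREM 3.1. If $J(t)$ is such that, for each $x \in S$,

(3.11)    $y(s) = \int_{-\infty}^{\infty} J(t)\,x(s+t)\,dt$

exists for almost all $s$ and $y \in L$, then $J \in L$.


THEOREM 3.2. If $J \in L$, then for each $x \in L$, $y(s)$ as defined by (3.11) exists
for almost all $s$, $y \in L$, and

(3.21)    $\int_{-\infty}^{\infty} |y(s)|\,ds \le M_\infty \int_{-\infty}^{\infty} |x(t)|\,dt$

where

(3.22)    $M_\infty = \int_{-\infty}^{\infty} |J(t)|\,dt$;

moreover $M_\infty$ is the best possible constant in (3.21) in the sense that if $C < M_\infty$, then

(3.23)    $\int_{-\infty}^{\infty} |y(s)|\,ds > C \int_{-\infty}^{\infty} |x(t)|\,dt$

will hold for some $x \in L$.

Our first step in the proof of Theorem 3.1 is to prove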


LEMMA 3.3. If $J(t)$ is such that $y \in L$ whenever $x \in S$, then there is a constant
$M < \infty$ such that

(3.31)    $\int_{-\infty}^{\infty} |y(s)|\,ds \le M \int_{-\infty}^{\infty} |x(t)|\,dt$,    $x \in S_1$,

where $S_1$ is the subclass of $S$ consisting of those functions $x(t)$ in $S$ which vanish
outside the interval $0 \le t \le 1$.


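If $J(t)$ satisfies the hypothesis of Lemma 3.3, and no $M < \infty$ exists for
which (3.31) holds, then for each $n = 1, 2, \dots$ there is $x_n \in S_1$ such that

(3.32)    $\int_{-\infty}^{\infty} |y_n(s)|\,ds > 2^n \int_{-\infty}^{\infty} |x_n(t)|\,dt$,

$y_n$ being the transform of $x_n$. Since the faltung transformation is homogeneous,
we can assume that the functions $x_n(t)$ and $y_n(t)$ are divided by the left
member of (3.32) so that

(3.33)    $\int_{-\infty}^{\infty} |x_n(t)|\,dt < 2^{-n}$,    $\int_{-\infty}^{\infty} |y_n(s)|\,ds = 1$,    $n = 1, 2, \dots$.

Let $\lambda_1 = 0$ and choose constants $\alpha_1$ and $\alpha_2 > \alpha_1 + 1$ such that the inequality

(3.34)    $\int_{\alpha_n}^{\alpha_{n+1}} |y_n(s - \lambda_n)|\,ds > 1 - 3^{-n}$

holds when $n = 1$. Since

(3.35)    $\lim_{\lambda\to\infty} \int_{\alpha_2}^{\infty} |y_2(s - \lambda)|\,ds = 1$,

we can choose $\lambda_2 > \lambda_1 + 1$ and then choose $\alpha_3 > \alpha_2 + 1$ in such a way that
(3.34) holds when $n = 2$. Continuation by induction furnishes sequences $\lambda_n$
and $\alpha_n$ such that $\lambda_{n+1} > \lambda_n + 1$, $\alpha_{n+1} > \alpha_n + 1$,
and (3.34) holds for each $n = 1, 2, \dots$. Since $x_n(t) \in S_1$, it follows that
$x_n(t - \lambda_n)$ vanishes outside the interval $\lambda_n \le t \le \lambda_n + 1$. Let

(3.36)    $X(t) = \sum_{n=1}^{\infty} x_n(t - \lambda_n)$.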


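The series converges for each $t$ since, for each $t$, $x_n(t - \lambda_n) \ne 0$ for at most one $n$.
Properties of the sequences $x_n$ and $\lambda_n$ imply that $X \in S$. Hence, by hypothesis,

(3.37)    $Y(s) = \int_{-\infty}^{\infty} J(t)\,X(s+t)\,dt$

exists for almost all $s$, and $Y \in L$. Since $X(t)$ vanishes at all points $t$ not in one
of the mutually exclusive intervals $(\lambda_n, \lambda_n + 1)$, it follows from (3.37) that

(3.38)    $Y(s) = \sum_{n=1}^{\infty} \int_{\lambda_n}^{\lambda_n+1} J(t-s)\,x_n(t-\lambda_n)\,dt = \sum_{n=1}^{\infty} \int_{-\infty}^{\infty} J(t-s)\,x_n(t-\lambda_n)\,dt = \sum_{n=1}^{\infty} y_n(s - \lambda_n)$.

It follows from (3.33) and (3.34) that for each $n = 1, 2, \dots$

(3.39)    $\int_{\alpha_n}^{\alpha_{n+1}} |y_k(s - \lambda_k)|\,ds > 1 - 3^{-n}$,    $k = n$,
          $\int_{\alpha_n}^{\alpha_{n+1}} |y_k(s - \lambda_k)|\,ds < 3^{-k}$,    $k \ne n$.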

Hence the inequality

(3.41)    $|Y(s)| = \left| \sum_{k=1}^{\infty} y_k(s - \lambda_k) \right| \ge |y_n(s - \lambda_n)| - \sum_{k \ne n} |y_k(s - \lambda_k)|$

implies that

(3.42)    $\int_{\alpha_n}^{\alpha_{n+1}} |Y(s)|\,ds \ge 1 - 3^{-n} - \sum_{k \ne n} 3^{-k} = \frac{1}{2}$,    $n = 1, 2, \dots$.

This is inconsistent with the previous conclusion that $Y \in L$; hence $M < \infty$ exists
for which (3.31) holds and Lemma 3.3 is proved.
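
The arithmetic behind the contradiction is short enough to spell out. The sketch below assumes that the bounds in question are $1 - 3^{-n}$ for the $n$-th term and $3^{-k}$ for the others, and writes $\alpha_n$ for the end points of the intervals of integration; both notations are reconstructions introduced here.

```latex
1 - 3^{-n} - \sum_{k \ne n} 3^{-k}
  = 1 - \sum_{k=1}^{\infty} 3^{-k}
  = 1 - \frac{1/3}{1 - 1/3}
  = \frac{1}{2},
\qquad\text{so}\qquad
\int_{-\infty}^{\infty} |Y(s)|\,ds
  \ge \sum_{n=1}^{\infty} \int_{\alpha_n}^{\alpha_{n+1}} |Y(s)|\,ds
  \ge \sum_{n=1}^{\infty} \frac{1}{2}
  = \infty .
```

The second step uses only that the intervals $(\alpha_n, \alpha_{n+1})$ are mutually exclusive, which is what makes the lower bound $1/2$ on each of them incompatible with $Y \in L$.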
To prove Theorem 3.1, let $J(t)$ satisfy its hypothesis, and let $x_\delta(t)$ be the
function in (2.42) which is $\delta^{-1}$ over $0 \le t < \delta$ and is 0 otherwise. If $0 < \delta < 1$,
then $x_\delta \in S_1$; hence Lemma 3.3 implies existence of a constant $D < \infty$ such that

(3.43)    $\int_{-\infty}^{\infty} |y_\delta(s)|\,ds \le D \int_{-\infty}^{\infty} |x_\delta(s)|\,ds = D$,    $0 < \delta < 1$.

Since, by Theorem 2.1, $J(t)$ is integrable over each finite interval, (2.44) must
hold; replacing $A$ by $2A$ and setting $u = -A$ in (2.44) gives

(3.44)    $\int_{-A}^{A} |J(t)|\,dt = \lim_{\delta\to 0} \int_{-A}^{A} |y_\delta(-s)|\,ds$,    $A > 0$.

From (3.43) and (3.44) we obtain

(3.45)    $\int_{-A}^{A} |J(t)|\,dt \le D$,    $A > 0$,

and this implies that

(3.46)    $\int_{-\infty}^{\infty} |J(t)|\,dt \le D$.

Thus $J \in L$ and Theorem 3.1 is proved.
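The first part of Theorem 3.2 is well known, and we give its proof merely
for completeness. If $J \in L$ and $x \in L$, the computation

(3.47)    $\int_{-\infty}^{\infty} |y(s)|\,ds \le \int_{-\infty}^{\infty} ds \int_{-\infty}^{\infty} |J(t)|\,|x(s+t)|\,dt = \int_{-\infty}^{\infty} |J(t)|\,dt \int_{-\infty}^{\infty} |x(s+t)|\,ds = \int_{-\infty}^{\infty} |J(t)|\,dt \int_{-\infty}^{\infty} |x(t)|\,dt$

is easily justified and (3.21) follows. If it be assumed that

(3.48)    $\int_{-\infty}^{\infty} |y(s)|\,ds \le C \int_{-\infty}^{\infty} |x(t)|\,dt$,    $x \in L$,

where

(3.49)    $C < \int_{-\infty}^{\infty} |J(t)|\,dt$,

then we can set $D = C$ in (3.43) to obtain $D = C$ in (3.46) and have a contradiction
of (3.49). Therefore if $C < M_\infty$, then $x \in L$ exists for which (3.23) holds
and Theorem 3.2 is proved.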

4. Conditions for $y \in B$. The measurable upper bound of a real measurable
function $\xi(t)$ defined over $-\infty < t < \infty$ is the least number $\beta$ such that $\xi(t) \le \beta$
for almost all $t$. We write $\beta = \operatorname{m.u.b.} \xi(t)$; and let $B$ denote the class of all
complex-valued measurable functions $x(t)$ for which $\operatorname{m.u.b.} |x(t)| < \infty$.


THEOREM 4.1. If $J(t)$ is such that, for each $x \in S$,

(4.11)    $y(s) = \int_{-\infty}^{\infty} J(t)\,x(s+t)\,dt$

exists for almost all $s$ and $y \in B$, then $J \in B$.


THEOREM 4.2. If $J \in B$, then for each $x \in L$, $y(s)$ as defined by (4.11) exists
for all $s$ and

(4.21)    $\operatorname*{l.u.b.}_{-\infty<s<\infty} |y(s)| \le \beta \int_{-\infty}^{\infty} |x(t)|\,dt$

where

(4.22)    $\beta = \operatorname*{m.u.b.}_{-\infty<t<\infty} |J(t)|$;

moreover $\beta$ is the best possible constant in (4.21) in the sense that, if $C < \beta$, then

(4.23)    $\operatorname*{l.u.b.}_{-\infty<s<\infty} |y(s)| > C \int_{-\infty}^{\infty} |x(t)|\,dt$

will hold for some $x \in L$.


Our first step in the proof of Theorem 4.1 is to prove the following lemma
in which $S$ and $S_1$ denote the classes of step functions previously defined in §1
and Lemma 3.3.


LEMMA 4.3. If $J(t)$ is such that $y \in B$ whenever $x \in S$, then there is a constant
$M$ such that

(4.31)    $\operatorname{m.u.b.} |y(s)| \le M \int_{-\infty}^{\infty} |x(t)|\,dt$,    $x \in S_1$.


To prove Lemma 4.3, let $J(t)$ satisfy its hypothesis and assume that (4.31)
fails. Failure of (4.31) implies existence for each $n = 1, 2, 3, \dots$ of a function
$x_n \in S_1$ having a transform $y_n$ such that

    $\operatorname{m.u.b.} |y_n(s)| > 4^n \int_{-\infty}^{\infty} |x_n(t)|\,dt$.

We can suppose that each $x_n$, and hence also $y_n$, has been multiplied by the
appropriate constant to give

(4.32)    $\operatorname{m.u.b.} |y_n(s)| = 2^n$;    $\int_{-\infty}^{\infty} |x_n(t)|\,dt < 2^{-n}$,    $n = 1, 2, \dots$.

If $\theta_1, \theta_2, \dots$ is a sequence of which each element is either 0 or 1, then

(4.33)    $X(t) = \sum_{n=1}^{\infty} \theta_n x_n(t - n) \in S$.

Hence under our hypothesis

(4.34)    $Y(s) = \int_{-\infty}^{\infty} J(t)\,X(s+t)\,dt \in B$.

Starting with (4.34), we obtain

(4.35)    $Y(s) = \sum_{n=1}^{\infty} \theta_n y_n(s - n) \in B$.

That the conclusions just obtained are contradictory, and hence that Lemma
4.3 is true, is a consequence of the following lemma in which we write $w_n(s)$
for $y_n(s - n)$.


LEMMA 4.4. If $w_n(s)$ is a sequence of measurable functions, defined over
$-\infty < s < \infty$, such that

(4.41)    $\operatorname{m.u.b.} \left| \sum_{n=1}^{\infty} \theta_n w_n(s) \right| < \infty$

for each sequence $\theta_n$ of which each element is 0 or 1, then there is a constant $Q < \infty$
such that

(4.42)    $\operatorname{m.u.b.} |w_n(s)| < Q$,    $n = 1, 2, 3, \dots$.


If $p$ is a positive integer and we set $\theta_n = 0$ or 1 according as $n \ne p$ or $n = p$,
we see that a constant $Q_p < \infty$ exists such that

(4.43)    $\operatorname{m.u.b.} |w_p(s)| = Q_p$.

To prove (4.42) amounts to proving that the sequence $Q_p$ is bounded. Assume
to the contrary that

(4.44)    $\limsup_{p\to\infty} Q_p = \infty$.

Setting $\theta_n = 1$ for each $n$ in (4.41) shows that the series $\sum_{n=1}^{\infty} w_n(s)$ converges
for almost all $s$; hence $\lim_{n\to\infty} w_n(s) = 0$ for almost all $s$. Therefore by a theorem
of Egoroff(4) $w_n(s)$ converges to 0 essentially uniformly over each set $E$
of finite measure $|E|$; that is, corresponding to each $\delta > 0$, there is a subset $F$
of $E$ such that $|E - F| < \delta$ and $w_n(s)$ converges to 0 uniformly over $F$. Using

(4.44), choose an index m, such that R,;=m.u.b. | wa(s)| >2 when n=. Let 

E, be a bounded set of positive measure such that | wa(s)| >1 when n=”, 

s Let Fi be a subset of E; such that | <| /2? and wa(s) con- 

verges to 0 uniformly over Fi. Choose m2>m such that |w,(s)| <2-? when 

s and also Re=m.u.b. | wa(s)| >3+R: when n=m. Let be a 

bounded set of positive measure such that | >2+Ri when s Eo. 

Let bea subset of such that | E:— <| /2*, | <| 
and such that w,(s) converges to 0 uniformly over F2. Choose n3>mz such 

that | wa(s)| <2-* when n =n, s € Fz, and also Rs=m.u.b.|w,(s)| >4+Rit+Re 

when n=n3. We continue by induction to obtain sequences of numbers and 

sets such that for each p=1, 2, 3,--- 


(4.45) | E,| > 0; m.u.b. | Wnp(S) | = R,; 


(4.46) | w.,(s)| > p+ Re, se E>; 
k=1 


Pp 
(4.47) Ey; wa,(s)| < 2-7, seF 51; 


k=l 


Setting, for each positive integer k, = Ex iyi , we find 


E, — G, = Ex — Fp = (& — 
p=k p=k 
so that by (4.48) 


p=k p=k 
and therefore 2 | Ex| /2 > 0. Let 


W(s) = Wn,(S)- 


(*) See, for example, E. W. Hobson, The Theory of Functions of a Real Variable, vol. 2, p. 144. 
The extension of the theorem to complex-valued functions is easily made. 




For almost all s in G, we find on using (4.45), (4.46), and (4.47) that 


tag(s) | + | — | | 


+k+ | — 
p=k+1 
Hence for each integer & there is a set of measure |G,| >0 such that | W(s)| 
2k-—1 for all s in the set. Therefore 


(4.49) m.u.b. | >> Wn,(S)| = 

| pel 
But (4.49) contradicts the hypothesis that (4.41) must hold in case $\theta_n$ is 1
when $n = n_1, n_2, n_3, \dots$ and 0 otherwise. This completes the proof of Lemma
4.4 and hence also that of Lemma 4.3.


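To prove Theorem 4.1, let $J(t)$ satisfy its hypothesis. By Lemma 4.3 there
is a constant $D < \infty$ such that

(4.51)    $\operatorname{m.u.b.} |y(s)| \le D \int_{-\infty}^{\infty} |x(t)|\,dt$,    $x \in S_1$.

Let, where $0 < \delta < 1$, $x_\delta(t) = \delta^{-1}$ when $0 \le t \le \delta$ and $x_\delta(t) = 0$ otherwise. Then
$x_\delta \in S_1$ and it follows from (4.51) that

(4.52)    $\operatorname*{m.u.b.}_{-\infty<s<\infty} \left| \frac{1}{\delta} \int_s^{s+\delta} J(t)\,dt \right| = \operatorname*{m.u.b.}_{-\infty<s<\infty} \left| \frac{1}{\delta} \int_{-s}^{-s+\delta} J(t)\,dt \right| \le D$.

But since $J(t)$ is integrable over each finite interval, $(1/\delta) \int_s^{s+\delta} J(t)\,dt$ is a
continuous function of $s$ for each $\delta > 0$. Hence it follows from (4.52) that

(4.53)    $\left| \frac{1}{\delta} \int_s^{s+\delta} J(t)\,dt \right| \le D$,    $-\infty < s < \infty$,    $0 < \delta < 1$.

But, by one form of the fundamental theorem of the calculus,

(4.54)    $\lim_{\delta\to 0} \frac{1}{\delta} \int_s^{s+\delta} J(t)\,dt = J(s)$

for almost all $s$. Hence $|J(s)| \le D$ for almost all $s$ so that

(4.55)    $\beta = \operatorname*{m.u.b.}_{-\infty<t<\infty} |J(t)| \le D$.

Therefore $J \in B$ and Theorem 4.1 is proved.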
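To prove Theorem 4.2, let $J \in B$ so that $\beta = \operatorname{m.u.b.} |J(t)| < \infty$. If $x \in L$,
then for each $s$

    $|y(s)| = \left| \int_{-\infty}^{\infty} J(t)\,x(s+t)\,dt \right| \le \int_{-\infty}^{\infty} |J(t)|\,|x(s+t)|\,dt \le \beta \int_{-\infty}^{\infty} |x(s+t)|\,dt = \beta \int_{-\infty}^{\infty} |x(t)|\,dt$

and (4.21) follows. If it be assumed that

(4.56)    $\operatorname*{l.u.b.}_{-\infty<s<\infty} |y(s)| \le C \int_{-\infty}^{\infty} |x(t)|\,dt$,    $x \in L$,

where

(4.57)    $C < \beta = \operatorname{m.u.b.} |J(t)|$,

then we can set $D = C$ in (4.51) to obtain $D = C$ in (4.55) and have a contradiction
of (4.57). Therefore if $C < \beta$, then $x \in L$ exists for which (4.23) holds,
and proof of Theorem 4.2 is complete.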

5. Conditions for continuity of $y(s)$. The following theorem, which we give
mainly for comparison with other theorems, is easily proved.


THEOREM 5.1. In order that $J(t)$ may be such that

(5.11)    $y(s) = \int_{-\infty}^{\infty} J(t)\,x(s+t)\,dt$

exists for all real $s$ and is continuous whenever $x \in L$, it is necessary and sufficient
that $J(t)$ be measurable and essentially bounded.


Necessity is a consequence of Theorem 2.5. If $\beta = \operatorname{m.u.b.} |J(t)| < \infty$ and
$x \in L$, then the estimate

    $|y(s+h) - y(s)| \le \int_{-\infty}^{\infty} |J(t)|\,|x(s+h+t) - x(s+t)|\,dt \le \beta \int_{-\infty}^{\infty} |x(t+h) - x(t)|\,dt$

together with the fact that $x \in L$ implies that the last member converges to 0
as $h \to 0$, shows that $y(s)$ is uniformly continuous.

It is interesting to note in connection with Theorem 5.1 and earlier theorems
that the hypothesis that $y(s)$ exists and is continuous for all real $s$ whenever
$x \in S$ does not imply that $J \in B$. To prove this, let $J(t)$ be a function in $L$
which is not essentially bounded and which vanishes outside some finite interval
$a \le t \le b$, say $J(t) = t^{-1/2}$ over $0 < t < 1$ and $J(t) = 0$ otherwise. Let $x \in S$.
Then for each fixed real $s_0$

    $y(s_0) = \int_a^b J(t)\,x(s_0 + t)\,dt$

exists since $x(s_0+t)$ is measurable and bounded over the finite interval $a \le t \le b$
outside of which $J(t)$ vanishes. If $K$ is chosen such that $|x(s_0+t)| \le K$ over
$a - 1 \le t \le b + 1$, then when $|h| < 1$


| y(so + — y(so) | sf a 


sK 


and, since the last integral converges to 0 with $h$, $y(s)$ is continuous at $s_0$.
Thus Theorem 5.1 will fail if the phrase "whenever $x \in L$" is replaced by the
phrase "whenever $x \in S$."

6. Some examples. Theorem 2.1 differs from Theorems 3.1 and 4.1 in that
the hypothesis of Theorem 2.1 involves the special class $S_U$ of unit step functions
while the hypotheses of Theorems 3.1 and 4.1 involve the larger class $S$.
We are going to show that Theorems 3.1 and 4.1 will fail if $S$ is replaced by $S_U$
in their statements. For the case of Theorem 4.1, we observe that if $x \in S_U$
then $x \in B$ and hence that if $J \in L$ then $y \in B$; therefore the hypothesis that
$y \in B$ whenever $x \in S_U$ does not imply that $J \in B$. For the case of Theorem 3.1,
let $J(t) = e^{2\pi i t}$. If $x \in S_U$, then

    $y(s) = \int_{-\infty}^{\infty} J(t)\,x(s+t)\,dt$

exists for each $s$ since $x(s+t) \in L$ and $e^{2\pi i t}$ is measurable and bounded; and the
fact that $x(s+t)$ is constant over unit intervals, together with the fact that
the integral of $e^{2\pi i t}$ over each unit interval is 0, implies that $y(s) = 0$ and
hence $y \in L$. Since $J \in L$ fails, the hypothesis that $y \in L$ whenever $x \in S_U$ does
not imply that $J \in L$.
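
The vanishing of the faltung in this example can be checked numerically. The sketch below is a minimal illustration under stated assumptions: the kernel is taken to be $e^{2\pi i t}$, the step function has only finitely many nonzero unit steps, and the helper names `unit_interval_integral` and `faltung_unit_step` are hypothetical.

```python
import cmath
import math

def unit_interval_integral(c):
    """Value of the integral of exp(2*pi*i*t) over c <= t <= c+1,
    computed from the antiderivative exp(2*pi*i*t) / (2*pi*i)."""
    w = 2j * math.pi
    return (cmath.exp(w * (c + 1)) - cmath.exp(w * c)) / w

def faltung_unit_step(coeffs, first_index, s):
    """y(s) = integral of J(t) x(s+t) dt for J(t) = exp(2*pi*i*t), where
    x equals coeffs[k] on [first_index + k, first_index + k + 1) and 0 elsewhere."""
    total = 0j
    for k, c in enumerate(coeffs):
        a = first_index + k - s  # x(s+t) = c exactly when a <= t < a + 1
        total += c * unit_interval_integral(a)
    return total

# A unit step function with three nonzero steps, evaluated at a non-integer s.
value = faltung_unit_step([1.0, 0.5, 0.25], -1, 0.3)
```

Up to rounding error `value` is 0 for every choice of coefficients and of `s`, because each unit step contributes a full period of the kernel; the kernel itself has modulus 1 everywhere and so is not in $L$.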

We show also that none of Theorems 2.1, 3.1 and 4.1 will hold if the hypotheses
are relaxed to require that $y(s)$ have the stated property only when
$x(t)$ is an ordinary step function. By an ordinary step function, we mean a
finite linear combination of simple step functions; a simple step function being
a function $\xi(t)$ such that $\xi(t) = 1$ for all $t$ in the interior of some finite interval $I$
and $\xi(t) = 0$ for all $t$ outside the closure of $I$. Except for the inconsequential
fact that we do not require ordinary step functions to have right-hand continuity
at end points of intervals, an ordinary step function may be described
as a function in $S$ which vanishes outside some finite interval; hence each
ordinary step function is equal, for all except a finite set of values of $s$, to a
function in $S$.


For the case of Theorems 2.1 and 4.1, let

(6.11)    $J_1(t) = t e^{it^2}$.

If $x(t) = 1$ when $a < t < b$ and $x(t) = 0$ when $t < a$ and when $t > b$, and $y_1(s)$ is the
$J_1$ transform of $x(t)$, then

(6.12)    $y_1(s) = \int_{a-s}^{b-s} t e^{it^2}\,dt = -i\left[ e^{i(b-s)^2} - e^{i(a-s)^2} \right]/2$;

so that $y_1(s)$ exists for all $s$ and is continuous over $-\infty < s < \infty$, and

(6.13)    $|y_1(s)| \le 1$,    $-\infty < s < \infty$.

It follows easily that the $J_1$ transform of each ordinary step function exists
for all $s$ and is bounded and continuous. But $J_1$ is not essentially bounded,
and the condition

(6.14)    $\operatorname*{l.u.b.}_{-\infty<u<\infty} \int_u^{u+A} |J_1(t)|\,dt < \infty$

fails for each $A > 0$. Thus the hypothesis that $y(s)$ exists and is bounded and
continuous over $-\infty < s < \infty$ whenever $x(t)$ is an ordinary step function implies
neither the conclusion of Theorem 4.1 nor the conclusion of Theorem 2.1.
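
The closed form and the bound for the transform of a simple step function can be checked numerically. The sketch below assumes that the kernel in (6.11) is $J_1(t) = t e^{it^2}$, whose antiderivative is $e^{it^2}/(2i)$; the function names `y1_closed` and `y1_midpoint` and the sample end points are hypothetical.

```python
import cmath

def y1_closed(a, b, s):
    """Closed form of the integral of t*exp(i t^2) over a-s <= t <= b-s,
    using the antiderivative exp(i t^2) / (2i)."""
    return (cmath.exp(1j * (b - s) ** 2) - cmath.exp(1j * (a - s) ** 2)) / 2j

def y1_midpoint(a, b, s, n=4000):
    """Midpoint-rule approximation of the same integral with n subintervals."""
    lo, hi = a - s, b - s
    h = (hi - lo) / n
    total = 0j
    for k in range(n):
        t = lo + (k + 0.5) * h
        total += t * cmath.exp(1j * t * t)
    return total * h

# The two evaluations agree for the step function equal to 1 on (0, 1).
gap = abs(y1_closed(0.0, 1.0, 0.0) - y1_midpoint(0.0, 1.0, 0.0))

# The modulus never exceeds 1, however far the window is shifted.
largest = max(abs(y1_closed(0.0, 1.0, s)) for s in (-50.0, -3.7, 0.0, 2.5, 1000.0))
```

The bound holds because the closed form is half the difference of two numbers of modulus 1, even though $|J_1(t)| = |t|$ is unbounded and its integral over windows of fixed length grows without limit.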
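For the case of Theorem 3.1, let

(6.21)    $J_2(t) = e^{it^n}$

where $n$ is a fixed real number greater than 2. If $x(t) = 1$ when $a < t < b$ and
$x(t) = 0$ when $t < a$ and when $t > b$, and if $y_2(s)$ is the $J_2$ transform of $x(t)$, then
we have

(6.22)    $y_2(s) = \int_{-\infty}^{\infty} J_2(t)\,x(s+t)\,dt = \int_{a-s}^{b-s} e^{it^n}\,dt$.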


Integration by parts gives, when | s| is so great that the intervala—s<t<b—s 
does not contain the origin, 
(6.23) yx(s) = — al f 
m a--s mn a—s 
Hence, for such values of s, 
+ 


so that 




(6.25) yo(s) | 2/n. 


Since $y_2(s)$ is continuous, $n > 2$, and (6.25) holds, we have $y_2 \in L$. Thus the $J_2$
transform of each simple step function is bounded, continuous, and in $L$; and
it follows that the $J_2$ transform of each ordinary step function also has these
properties. But $J_2 \in L$ fails. This shows that the hypothesis that $y$ is bounded,
continuous, and in $L$ whenever $x$ is an ordinary step function does not imply
that $J \in L$.

In case $n = 2$, the transformation determined by the kernel (6.21) becomes


(6.26) y(s) = + = 


and this can be written in the form 


(6.27) n(s) = 


where 
(6.28) n(s) = = 


The function $\eta(s)$ of (6.27) differs in only a simple way from the Fourier
transform of $\xi(t)$. If $\xi(t)$ is a simple step function, it is easy to compute $\eta(s)$
and to show that $\eta(s)$ is not in class $L$.

7. The class $K$ of measurable functions satisfying (2.12). The classes $L$
and $B$, and the linear vector metric complete spaces associated with them, are
well known. (See, for example, the book of Banach previously cited.) In
Theorem 2.1 we were led to the class $K$ of measurable functions, a member of
which we now denote by $x(t)$, such that

(7.1)    $\operatorname*{l.u.b.}_{-\infty<u<\infty} \int_u^{u+A} |x(t)|\,dt < \infty$

for each $A > 0$. The class $K$ contains all elements of $L$ and all elements of $B$.
It is easy to show that the class $K$ is linear, that is, if $x_1, x_2 \in K$ and $c_1, c_2$ are
constants, then $c_1 x_1 + c_2 x_2 \in K$. In terms of a fixed $A > 0$ and a number $\phi(A) > 0$
let the norm of each $x \in K$ be defined by

(7.2)    $\|x\| = \|x\|_A = \phi(A) \operatorname*{l.u.b.}_{-\infty<u<\infty} \int_u^{u+A} |x(t)|\,dt$.
u 


Dependence of $\|x\|$ on $A$ is illustrated by the fact that if $x_0(t) = |t|^{-1/2}$, then
$x_0 \in K$ and

(7.3)    $\|x_0\| = \phi(A) \int_{-A/2}^{A/2} |x_0(t)|\,dt = 2(2A)^{1/2} \phi(A)$.
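
The value of the least upper bound for $x_0(t) = |t|^{-1/2}$ can be checked with a short computation. This is a sketch under stated assumptions: the window integral is evaluated exactly from the antiderivative $2\,\mathrm{sgn}(t)\,|t|^{1/2}$, the supremum over $u$ is only sampled on a finite grid, and the names `F`, `window_integral`, and `sup_window` are hypothetical.

```python
import math

def F(t):
    """Antiderivative of |t|**-0.5, namely 2*sign(t)*sqrt(|t|)."""
    return math.copysign(2.0 * math.sqrt(abs(t)), t)

def window_integral(u, A):
    """Integral of |t|**-0.5 over u <= t <= u + A."""
    return F(u + A) - F(u)

def sup_window(A, lo=-10.0, hi=10.0, steps=20001):
    """Largest window integral over a uniform grid of starting points u.
    The grid only samples the supremum; it happens to contain u = -A/2
    for the values of A used below."""
    best = 0.0
    for i in range(steps):
        u = lo + (hi - lo) * i / (steps - 1)
        best = max(best, window_integral(u, A))
    return best

# For A = 1 the window centered at the origin gives 2*sqrt(2*A).
best_for_unit_window = sup_window(1.0)
```

The maximum is attained by the window centered at the singularity, which is why the integral in (7.3) runs from $-A/2$ to $A/2$; the resulting value $2(2A)^{1/2}$ is finite for every $A$ although $x_0$ belongs to neither $L$ nor $B$.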




There seems to be no compelling reason why one choice of A and ¢(A) should 
be preferred over another. If xe L and ¢(A)=1, then ||x\|4 converges to 
J2.|x(t)|dt as A>; if xe Band $(A)=1/A, then ||x||4 converges to 
m.u.b. |x(t)| as 

Assuming now that A > 0 and φ(A) > 0 are fixed, it is easy to see that the
class K becomes a linear vector metric space when the distance between two
elements x_1 and x_2 of K is defined by ||x_2 − x_1||. We conclude by showing
that this space is complete. Let x_1, x_2, ... be a Cauchy sequence in K so
that ||x_m − x_n|| → 0 as m, n → ∞. Then as m, n → ∞


(7.4)    I_{m,n,r} = \int_{rA}^{(r+1)A} |x_m(t) - x_n(t)|\,dt


converges to 0 uniformly in r. Since space L is complete, there is for each
integer r = 0, ±1, ±2, ... a function ξ_r(t) defined over rA ≤ t < (r+1)A such that


(7.5)    \lim_{n\to\infty} \int_{rA}^{(r+1)A} |\xi_r(t) - x_n(t)|\,dt = 0.


If we let ξ(t) be the function defined over −∞ < t < ∞ which agrees with ξ_r(t)
in the interval rA ≤ t < (r+1)A, then for each real r


(7.6)    I_{n,r} = \int_{rA}^{(r+1)A} |\xi(t) - x_n(t)|\,dt


converges to 0 as n → ∞. The inequality

(7.7)    |I_{m,r} - I_{n,r}| \le I_{m,n,r}


together with the fact that the right member converges to 0 uniformly in r
as m, n → ∞ implies that the left member converges to 0 uniformly in r as
m, n → ∞ and hence that I_{n,r} converges uniformly in r as n → ∞. But I_{n,r}
converges to 0 as n → ∞. Hence I_{n,r} converges uniformly to 0 as n → ∞, that is,


(7.8)    \lim_{n\to\infty} \operatorname*{l.u.b.}_{r} \int_{rA}^{(r+1)A} |\xi(t) - x_n(t)|\,dt = 0,


or


(7.9)    \lim_{n\to\infty} \operatorname*{l.u.b.}_{u} \int_{u}^{u+A} |\xi(t) - x_n(t)|\,dt = 0.

This implies that ξ ∈ K, and on multiplying by the constant φ(A) we obtain
lim ||ξ − x_n|| = 0. Thus each Cauchy sequence x_n in K has a limit in K,
and completeness of the space K is established.
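
The inequality (7.7) is the reverse triangle inequality applied under the integral sign. A sketch of the step follows; the index notation I_{n,r} and I_{m,n,r} is an assumption made here for the integrals over rA ≤ t ≤ (r+1)A and may differ from the author's symbols.

```latex
% requires amsmath. Illustrative notation (an assumption):
%   I_{n,r}   = \int_{rA}^{(r+1)A} |\xi(t) - x_n(t)| dt
%   I_{m,n,r} = \int_{rA}^{(r+1)A} |x_m(t) - x_n(t)| dt
\begin{align*}
|I_{m,r} - I_{n,r}|
  &\le \int_{rA}^{(r+1)A} \bigl|\, |\xi(t)-x_m(t)| - |\xi(t)-x_n(t)| \,\bigr|\,dt \\
  &\le \int_{rA}^{(r+1)A} |x_m(t)-x_n(t)|\,dt = I_{m,n,r}
   \le \frac{\|x_m - x_n\|}{\phi(A)} .
\end{align*}
```

The last bound does not involve r, which is why the Cauchy condition in K gives convergence to 0 uniformly in r.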


CORNELL UNIVERSITY,
ITHACA, N. Y.


TOPOLOGICAL GROUP FOUNDATIONS OF 
RIGID SPACE GEOMETRY 


BY 
DEANE MONTGOMERY AND LEO ZIPPIN 


Dedicated to the memory of Bella Zippin, mother of one and friend of both of us. 


1. Hilbert, after building up geometry from a point of view which relegated 
continuity considerations to the background [4], built up plane geometry 
afresh [5] on the foundation of groups of homeomorphisms of the number 
plane, both “continuity” concepts. It is this latter point of view with which 
we are concerned in this paper. Hilbert carried out this program only for the 
plane but he hinted that it might be possible to carry it out in a somewhat 
similar way for three-space. Kerékjártó took up the problem [6] for three-space
and made a great deal of progress with it, but, as he wrote before the
recent developments in topological groups, he found it necessary to employ 
a stronger set of axioms than is necessary now. 

Relying on the theory of topological groups we recently characterized the 
rotation group of three-space [9], and in commenting on that work P. A. 
Smith [12] suggested that it might provide the basis for an extension of 
Hilbert’s program to three-space. 

The purpose of this paper is twofold. In the first place we shall character- 
ize the classical space geometries on the basis of a fairly weak set of axioms, 
and in the second place we shall show that Hilbert’s axioms for the case of the 
plane can be weakened by replacing what might be called his “three-point 
condition” by a two-point condition. We achieve this latter purpose more or 
less incidentally to the first. 

In comparing our set of three-space axioms with Hilbert’s axioms for the 
plane we find that the first axiom is the same in both cases. The third axiom 
of this paper is weaker than Hilbert’s, and our second purpose above is to 
show that this weaker axiom also suffices in Hilbert’s case. Our second axiom 
is weaker than Hilbert’s second axiom in that it relates to the subgroup leav- 
ing a single point fixed, but it is incomparably stronger in what it asks of that 
one subgroup. 

We do not, in this paper, settle the question of whether or not Hilbert’s 
second axiom is adequate for three-space geometries. This question is bound 
up with an unsolved problem concerning transformation groups. 

Finally we wish to remark that instead of assuming that the space we are 
dealing with is ordinary three-space, it is only necessary to make certain 
topological assumptions on the space from which it follows by virtue of the 


Presented to the Society, October 28, 1939; received by the editors December 20, 1939. 


same group axioms that the space is actually a number-space. But we reserve 
discussion of this matter for another occasion. In this connection see the ab- 
stract by the authors in the Bulletin of the American Mathematical Society, 
vol. 45 (1939) (no. 349). 

2. The axioms. We formulate two sets of axioms, the first set for the 
plane, and the second set for three-space. The first set, to which we proceed 
immediately, is the set used by Hilbert except that it has been materially 
weakened in the manner described. 

We assume then that there is given a system (E_2, G) where E_2 is the number-plane
and G is a set of sense preserving homeomorphisms of this plane,
and that this set satisfies the following conditions:


2.1. The system G is a group. 


The assumption tacitly implicit in the above is that G is effective, that is,
that no element except the identity leaves all of E_2 fixed.

With each point x in space there exists a subgroup G_x consisting of elements
of G which leave x fixed.


2.2. If x is any point of E_2 and y is distinct from x, then G_x(y) is infinite.


This axiom could be reformulated so that it would be entirely analogous 
to our axiom 2.2’ for three-space but we do not carry this out. It would in- 
volve almost no change in the work. 


2.3. Let (x, y) and (x’, y’) be two pairs of points of E_2, where the points of
a pair are not necessarily distinct. If there exist pairs (x_n, y_n) and (x_n’, y_n’),
the first arbitrarily near (x, y), the second arbitrarily near (x’, y’), and if
there exists an element of G taking (x_n, y_n) to (x_n’, y_n’), then there exists an
element of G taking (x, y) to (x’, y’).


As we have said, this set of axioms is exactly Hilbert’s except that the 
third axiom has been weakened to a condition on pairs instead of triples of 
points. 

We now formulate our axioms for three-space. We assume that there is
given a system (E_3, G) where E_3 is ordinary three-space and G is a set of
sense preserving homeomorphisms of E_3 satisfying the following conditions:


2.1’. The same as 2.1. 


2.2’. There exists a point p of E_3 such that the group G_p is a proper subgroup
of G and for a sequence of points p_n approaching p the sets G_p(p_n) are at least
two dimensional.


2.3’. The same as 2.3. 


Occasionally we shall refer to the situation described by the first set of 
axioms as the plane case and to the situation described by the second set as 




the space case. In both cases we shall prove that ordinary geometric concepts 
such as “line” and “distance” (and in the space case “plane”) may be defined 
in terms of G in such a way that we obtain either euclidean or hyperbolic 
geometry and that G is the group of rigid motions of the corresponding geome- 
try we obtain. We do this in the plane case by proving Hilbert’s axioms [5], 
that is, by proving that the two-point condition 2.3 implies the three-point
condition. The space case we treat in detail and show in detail that there are 
only the two systems. 

Our approach to this problem differs from Hilbert’s in one important re- 
spect. Hilbert analyzes more or less directly the topological nature of the 
orbit G_p(x). In three-space this course seems to us not feasible until much
more is known about strongly homogeneous subsets of space. But even grant- 
ing such knowledge our procedure has the advantage of making available the 
results of topological group theory. Thus, we proceed at once to a study of 
the group G_p as a topological transformation group. In brief summary, we
first confine our attention to a suitable invariant neighborhood of the point p
where orbits under G_p can be proved compact. We form the effective group in
this neighborhood, and show that any sequence of elements of this group has a
subsequence which converges to a homeomorphism of the neighborhood into 
itself. We then augment our group by the addition of such homeomorphisms. 
It transpires, only considerably later, that this enlargement is an illusory one. 
The enlarged group is then shown to be a compact topological transformation 
group on a “locally euclidean” space. From our previous work, we then know 
that our orbits are necessarily manifolds, and it is not difficult to show that 
they are indeed spheres. From an earlier paper of ours we learn also the com- 
plete structure of the group and its behavior in the neighborhood. We are 
now in a position to show, by an argument patterned on one of Hilbert’s, that 
the neighborhood above coincides with space. 

The use of a “two-point” rather than “three-point” axiom shows itself in 
one or two interesting ways in the study of G_p but becomes a matter of considerable
moment in the study of the group G as a topological transformation
group of the space. This argument is given in §12. To this point the case of 
the plane or of space may be treated more or less simultaneously, and it seems 
to us, in fact, that much of this generalizes with no great difficulty to four- 
space and perhaps farther. 

The remainder of the paper is devoted to a study of the geometry induced 
in space by the group G. Here, after a few paragraphs, we are on ground already
explored by Kerékjártó. His paper was not known to us until our own
had been completed, and we carry out the program essentially as we had it.
We do this in part for completeness’ sake, in part because the form in which
our solution is set differs sufficiently from Kerékjártó’s. At one point in
proving the linearity of our planes we borrow a very ingenious idea from
Kerékjártó’s paper which shortens considerably an argument of our own.




3. In a great part of the paper we treat the two cases simultaneously, 
calling the space (which of course is either E_2 or E_3) simply E. When we speak
of a sphere or a rotation group we of course mean the one appropriate to the 
dimension of the space. 

Let p be a point of E, which for E_2 may be any point, but which for E_3
is to be the point specified by 2.2’. 

Hilbert points out that for any x the set G_p(x) is closed. This is a consequence
of 2.3. Thus: let x_n be a sequence of points in G_p(x) converging to a
limit point x_0. There are elements g_n in G_p such that x_n = g_n(x). The element g_n
takes the pair (p, x) to the pair (p, x_n). By 2.3 there is a g in G which takes
(p, x) to (p, x_0) and this element certainly is in G_p.

By a similar method it is seen that G(x), which ultimately will be shown
to coincide with E, is closed.

4. Let R be the set of points x such that G_p(x) is compact. This set is not
vacuous for it contains p.


LEMMA 1. The set R is open. 


Let x be any point in R, and let B be a conditionally compact open set
containing G_p(x) in its interior. Let S be the boundary of B. We assert that
there is an open set V containing G_p(x) such that G_p carries no point of S inside
V. Otherwise there must exist a set of elements g_n in G_p and a set of points b_n
in S such that at least one point of the set g_n(b_n) lies in every neighborhood
of the compact set G_p(x). There is no loss in assuming that b_n approaches
some point b in S and g_n(b_n) approaches a point a in G_p(x). But then there
must be an element of G_p taking b to a which means that a point of S is in
G_p(x) contrary to the choice of B.

Let W denote those components of the above determined V which include
points of G_p(x). The set W is open and no element of G_p carries a point
of W outside of B. For such an element would also leave an element of W
inside B (any point namely in which W meets G_p(x)), and it would therefore
carry some point of W into S which is impossible.

It has now been shown that all points of W have orbits inside B. Therefore
every point of W has a compact orbit and x is in an open set W all of
whose points have compact orbits as the lemma demands.

4.1. The proof shows even more than is required in Lemma 1. It shows,
for any point x in R, that x is an interior point of a set W such that G_p(W)
is a conditionally compact set. If W̄ is the closure of W, then G_p(W̄) is compact
and x is seen to be an interior point of a set W̄ such that G_p(W̄) is compact.
These facts together with the Heine-Borel theorem enable us to state
the following lemma. 


LEMMA 2. If M is any compact subset of R, then G_p(M) is compact.


5. We consider now the action of G_p on R. Conceivably G_p has a nontrivial
subgroup leaving all of R fixed. Later this possibility will be ruled out,
but meanwhile we must take account of it. Let G_p^1 be the subgroup leaving all
of R fixed. This subgroup includes the identity at any rate, and G_p/G_p^1, which
will be denoted for brevity by H, is an effective transformation group of R.

Our next task is to show that H may be extended to become a compact 
transformation group of R. It will be assumed that E is assigned a bounded 
metric, say the metric of a three-sphere or a two-sphere according to the 
case, which is obtained by adding a point to E. This means that we can define 
a distance between any two transformations of E into itself or between any 
two transformations of a subset of E into itself. For example if f and g are 
two transformations of R into itself 


d(f, g) = l.u.b. d[f(x), g(x)]


where x ranges over R. Under this distance H becomes a metric space. 
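
That this least upper bound is a genuine distance can be checked in a few lines. The sketch below assumes only what the text states, namely that d on E is a bounded metric; the third transformation h is introduced here purely for illustration.

```latex
% requires amsmath. f, g, h: transformations of R into itself; d: the bounded metric on E.
\begin{align*}
d(f,h) &= \operatorname*{l.u.b.}_{x\in R} d[f(x),h(x)] \\
       &\le \operatorname*{l.u.b.}_{x\in R}\bigl( d[f(x),g(x)] + d[g(x),h(x)] \bigr) \\
       &\le d(f,g) + d(g,h), \\
d(f,g) &= 0 \iff f(x)=g(x) \text{ for every } x \in R .
\end{align*}
```

Boundedness of d on E keeps every d(f, g) finite, and the last line is where effectiveness on R matters: distinct elements of H differ somewhere on R and so lie at positive distance.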


5.1. LEMMA 3. If a sequence of elements of G converges everywhere on a set M
to a limit h, then h is continuous on M. If M is compact the convergence is
uniform.


Let us prove first that h is continuous. Let m be any point of M and let S
be any sphere with h(m) as a center. Let m_i be a sequence of points of M
approaching m. We shall show that almost all the points h(m_i) are inside or
on S. Assume that this is not true for an infinite subsequence, say m_{i_k}, and let
m_{i_k}m be a short arc joining m_{i_k} to m. Since g_n approaches h and since h(m_{i_k})
is outside S by assumption, there will certainly be an integer, say n_k, such
that g_{n_k}(m_{i_k}) is outside S. We may assume without loss of generality that
every g_{n_k}(m) is inside S. Hence there is a point, say y_k, on the arc m_{i_k}m such
that g_{n_k}(y_k) is on S. Assume that g_{n_k}(y_k) approaches a point b on S. Then g_{n_k}
takes the pair (y_k, m) which is near (m, m) to the pair [g_{n_k}(y_k), g_{n_k}(m)] which
is near [b, h(m)]. This is a contradiction from which the continuity of h follows.

We shall next show that the convergence is uniform in case M is compact.
If the convergence is not uniform, there is for some positive number e an
infinite set of indices k_1, k_2, ... and a set of points m_1, m_2, ... in the set M
such that


d[g_{k_i}(m_i), h(m_i)] ≥ e.


There is no loss in assuming that the sequence m_i converges to m and
that g_{k_i}(m_i) converges to a point b. From the above inequality b and h(m),
which is the limit of h(m_i) by the continuity, are distinct. The transformations
g_{k_i} therefore take the pair (m_i, m) which is near (m, m) to the pair [g_{k_i}(m_i),
g_{k_i}(m)] which is near [b, h(m)]. By 2.3 there must be an element of G taking
m to each of the distinct points b and h(m). This contradiction shows that the
convergence is uniform.




LEMMA 3.1. Let g_n be a sequence of elements of G converging to h everywhere
on a set M. Then if m_n approaches m it follows that g_n(m_n) approaches h(m).


The set B containing the point m and all the m_n’s is compact. Hence on this
set g_n converges uniformly to h. Let e be any positive number. For all n
greater than an integer N_1,


d[g_n(x), h(x)] < e/2


for all x in B. Since h is continuous, there will be an integer N_2 such that if n
is greater than N_2,


d[h(m_n), h(m)] < e/2.


For all n greater than N_1 and N_2 we have not only this latter inequality but
we have as a consequence of the first inequality


d[g_n(m_n), h(m_n)] < e/2.

The last two inequalities yield the desired conclusion.
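
The final step is a single application of the triangle inequality. Written out, with the index n and the integers N_1, N_2 taken as in the proof (the subscripts are a reading of the printed symbols):

```latex
% requires amsmath. Valid for every n greater than both N_1 and N_2.
\[
d[g_n(m_n), h(m)]
  \le d[g_n(m_n), h(m_n)] + d[h(m_n), h(m)]
  < \tfrac{e}{2} + \tfrac{e}{2} = e .
\]
```

Since e was an arbitrary positive number, g_n(m_n) approaches h(m).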


LEMMA 3.2. If a sequence f_n of elements of G_p converges everywhere on a compact
set M, invariant under G_p, to a transformation f, then f is a homeomorphism
of M into itself.


In view of Lemma 3 it is only necessary to show that f has a single valued 
inverse and that f(M) = M. 

If f does not have a single valued inverse, there must be two distinct
points b and c such that f(b) = f(c). Then the elements f_n take the pair (b, c)
to the pair [f_n(b), f_n(c)] and by 2.3 there is an element in G which takes both
b and c to f(b). This contradiction shows that f is one-one, and since M is compact
f must take M homeomorphically to f(M). We know that f(M) is a subset
of M, and to complete the proof of the lemma it remains only to show that
f(M) coincides with M.

Let b be any point in M. Then there is an element m_n in M such that
f_n(m_n) = b. Assume that the sequence m_n approaches a point m. Now f_n takes
the pair (m_n, m) which is near (m, m) to the pair [b, f_n(m)] which is near
[b, f(m)]. Hence some element of G takes m to both points b and f(m) which
is possible only if f(m) = b.


5.2. LEMMA 4. The group-space H defined in §5 is conditionally compact.


Let Y be a countable dense subset of R. Let g_n be an infinite sequence of
elements of H. Strictly speaking the g_n’s are not elements of G_p but there are
elements of G_p coinciding with these elements on R, and properties of the
group G_p may be used in examining the g_n’s.

For any point y in Y the sequence g_n(y) is conditionally compact and has a
convergent subsequence. The limit of this sequence belongs to G_p(y) and is a
point of R. By the diagonal process there exists a subsequence f_n of the elements
g_n such that f_n(y) converges to a unique point of R for every element
y of Y. Then, on Y, the sequence f_n converges to a pointwise limit function f.

It will now be shown that f is uniformly continuous in every conditionally
compact open subset R_1 of R. In order to do this it must be shown that for
every positive e there exists a positive d such that whenever y and y’ of Y·R_1
are nearer to each other than d, the corresponding f(y) and f(y’) are nearer
than e. If this is not the case, there must exist in Y a sequence of points y_n
and y_n’ which may be supposed to converge to the same point z of R, such
that f(y_n) and f(y_n’) also converge and converge to two distinct points y and y’
in R. This means that some of the elements g_n take a pair of points near z to a
pair near (y, y’). By 2.3 there is an element of G carrying the point z to the
two points y and y’. This is manifestly impossible and the contradiction establishes
the uniform continuity of f on the set R_1·Y.

This uniform continuity of f permits us to extend it, and we assume it is
so extended, to a single valued continuous transformation of R (which of
course is locally compact) into itself. The sequence f_n, which originally was
known to converge only on Y, is now seen to converge everywhere on R to
the transformation f. By Lemma 3 the convergence is uniform on compact
sets which implies that the sequence f_n of elements of H must be a Cauchy
sequence because our metric brings two functions close which agree closely
outside of a neighborhood of “infinity.”

5.3. When we speak of a topological transformation group we use the 
term with the definition as given in our papers referred to in the bibliography. 


LEMMA 5. The group H may be extended to a compact group H̄ which is an
effective topological transformation group of R.


The space H is conditionally compact so that if the space is made complete
the resulting space H̄ will be compact. The preceding lemmas and their
proofs show that Cauchy sequences of H will actually converge to homeomorphisms
of R into itself. The space H̄ is therefore a transformation group
of R. To be sure that it is a topological transformation group we must prove
that if g_n approaches g, these being elements in H̄, and x_n approaches x in R
then g_n(x_n) approaches g(x). This follows as in the proof of Lemma 3.1.

That H̄ is an effective group follows from the fact that distinct elements
of the space H̄ arise from nonequivalent Cauchy sequences and these give
rise to distinct limiting transformations.

6. The present section contains some simple, purely group theoretical, 
considerations which will be of use to us later. 


THEOREM 1. The only two dimensional manifolds which are coset spaces 
(orbits) of a compact connected group H are the two-sphere, torus, and projective 
plane. 




There is no loss in assuming that H is effective in its action on the mani- 
fold so that H is a Lie group [10]. The theorem then follows from the corre- 
sponding theorem on Lie groups due to Cartan [2]. 

We note without giving the proof, which is not difficult, the following: 
If a circle group operates on a torus and has a fixed point, then it must leave 
every point of the torus fixed. 


THEOREM 2. The only compact group of sense preserving transformations
which can act effectively and transitively on a two-sphere M is the group of rigid
rotations of the two-sphere.


For connected groups the theorem is true [9]. If H is the group, let H*
be the identity component of H, and let x be a point of the sphere. The dimension
of H*(x) is the same as the dimension of H(x), namely two, and
therefore H*(x) coincides with M. Since H* is connected it must be the two-sphere
rotation group. Assuming H* is not all of H means that, for some y,
(H*)_y is a proper subgroup of H_y, because every element of H, being sense
preserving, has a fixed point; the connectedness of the group (H*)_y (it is of
course circular) shows that it is the component of the identity of H_y. Then
(H*)_y is invariant in H_y. Let M* be the decomposition space of M under
(H*)_y. The group H_y/(H*)_y acts on this space which is an interval. Hence
H_y/(H*)_y contains only the identity, or it is a group of two elements which
merely interchanges the ends of M* while leaving a “middle” point fixed. The
latter possibility cannot occur, for if it did H_y would contain an element moving
y. Hence H_y/(H*)_y contains only the identity element and (H*)_y is not a
proper subgroup of H_y as we assumed. The contradiction shows that H* coincides
with H and that H is the group of rigid rotations of the two-sphere.


COROLLARY. The only compact group of sense preserving transformations of
three-space into itself with at least one two dimensional orbit is the two-sphere 
rotation group. 


7. In the present section we confine our attention to the space case. The
groups H and G_p have the same orbits in R, and since these orbits are closed H̄
has the same orbits as do H and G_p. Let H* be the component of the identity
of H̄. The orbit of a point under H* will have the same dimension as the orbit
of the point under H̄. Hence H* has a sequence of two dimensional orbits
approaching the fixed point p. We will now consider the action of the effective
compact transformation group H* on the connected locally “euclidean” space
R^0, where R^0 denotes that component of R which contains p. (It is conceivable
that some subgroup of H̄ should leave all of R^0 fixed. We assume without
changing our notation that this is not the case. There is no loss in this process
as we might as well have assumed we were working with R^0 before.)

It follows from theorems on the structure of coset spaces [10] that any
two dimensional orbit of H* in R^0 must be a two dimensional manifold. Any
two dimensional orbit of H* in R^0 must be, therefore, one of three types of
manifold, the two-sphere, the projective plane, or the torus. The projective
plane cannot be imbedded in E_3 so that the number of possibilities is reduced
immediately to two, the sphere and torus.

In the orbit space associated with R^0, call it R*, every two dimensional
orbit is a cut point of order two precisely. The set of such orbits is open [10].
The space R* has one non-cut point, at least, namely the orbit consisting of
the point p only. By the cyclic element theory, R* must be either a line, a ray,
or an interval. It cannot be a line because it contains a non-cut point. It
cannot be an interval for this would mean that R^0, an open subset of E, would
be compact. Hence R* is a ray, and this shows that all orbits in R^0, with the
exception of p, are two dimensional orbits which are either spheres or tori.
The group H̄ can be seen to be a Lie group because it operates on a locally
euclidean connected space with locally connected orbits [10]. It will now be of
dimension three at most [10], and it will be effective on each one of its two
dimensional orbits. For, if a subgroup left all of a manifold orbit fixed this
same subgroup would leave the whole space fixed by a simple application, as
in an earlier paper of ours [8], of a theorem of Newman. If all two dimensional
orbits are spheres, then H* is the rotation group of a sphere, for this is
the only connected compact effective transformation group of a sphere. If all
two dimensional orbits are tori, then H* is a two dimensional toral group,
for this is the only Lie group which can be effective on a torus.

It is intuitively clear that all orbits must be spheres and we now give the 
proof. We will show that if one orbit is toral then all orbits are. Assume that 
one orbit H*(x) is toral. If H* is two dimensional, then H* is a toral group and 
all orbits are tori. If H* is three dimensional there must be a circular sub- 
group K leaving x fixed. But if a circular subgroup leaves one point of a torus 
fixed it must leave every point fixed. Hence K leaves all of H*(x) fixed and, 
since this separates space, we see by a familiar device that K leaves all of 
space fixed. In this case H* is not effective. We are therefore led to conclude 
that all two dimensional orbits are tori. 

This last situation is impossible. For if H* is a toral group we can form a
true section B of the space, that is, we can find a closed set B which has one
and only one point on each orbit in R^0. The set B will have to be homeomorphic
to R* and will be a ray. Now let H*(x) be a toral orbit inside a neighborhood
U of p which is homeomorphic to three-space. There will be in
U − H*(x) a one-cycle Z which does not bound in U − H*(x) and which is
outside H*(x); that is, it is contained in the component of U − H*(x) which
does not contain p. Then using the true section B we may deform H*(x) to
the point p. The cycle Z certainly bounds in U − p, contradicting its choice.

Therefore not every orbit is a toral orbit and H* must be the rotation
group of three-space, and every two dimensional orbit must be a two-sphere.
Let K be a circular subgroup of H*. There will have to be precisely two points
on each two-sphere orbit left fixed by K. These two points will define for us a
double valued function everywhere on R*. The end point of the ray R* is an
exception when the function is single valued. But at any rate it is possible to
pick out of these functional values a true section B of the entire space. The
existence of the ray B proves R^0 homeomorphic to euclidean three space.

From Theorem 2 of §6 we see that H̄ and H* must coincide, but we can
conclude even more.


THEOREM 3. The group H coincides with H̄ and is therefore the rotation
group of three-space.


Let x be any point of R distinct from p. The set H̄(x) is a two-sphere,
and H(x) coincides with H̄(x); or in other words H, a subgroup of H̄, is
transitive on the two-sphere. This is possible only if H is all of H̄ [11].

8. In this section we turn to the plane case, falling back on Lemma 5 
where we left it. 


THEOREM 3.1. The group H coincides with H̄ and is a circle group, the rotation
group of the plane. The set R^0 is homeomorphic to a plane.


The group H̄ cannot be totally disconnected, for such a group cannot operate
effectively on a locally planar space (as we have shown [8]). Let H* be
the identity-component of H̄. As in the space case, the orbits under H* must
be manifolds and therefore simple closed curves: they are obviously one dimensional.
It follows that H* is a Lie group and in particular the circle group
[10]. As in the space case the group H̄ must operate upon the decomposition
space of R^0 by orbits under H*: this space is a ray, with p as end point, and H̄
must be the identity upon it.

Now let g be some element of H̄, not the identity. There must be a point x
of R^0 such that gx is not x. On the other hand, gx is a point of H*(x) so that
for some g’ of H* we have


g’gx = x.


Since this element is sense preserving and leaves one point of the circle H*(x)
fixed, it leaves all H*(x) fixed and then all of R^0. Therefore it must be the
identity and we conclude that g belongs to H*. This shows that H̄ coincides
with H* and is a circle group. Since H is transitive on H̄(x), it is obvious that
H must coincide with H̄ which is effective on H̄(x).

Now R^0 is a connected open subset of the plane filled out by a continuous
family of simple closed curves and it is intuitively obvious and sufficiently
well known that R^0 must be homeomorphic to the plane.

We turn now to a simultaneous consideration of the two cases. What has 
been done above is summed up in the following theorem. 


THEOREM 4. The group H may be so topologized that it becomes the ordinary
rotation group of space, and it acts on R^0, which is homeomorphic to space, as
the ordinary rotation group does, in a properly chosen coordinate system.


9. THEOREM 5. The set R^0 is closed and so coincides with E.


The assumption that R^0 is not closed implies that there is an arc px which
is contained in R^0 except for its end point which is not in R^0. Since x is not in
R^0 (and not in R) there is a sequence of elements g_n of G_p such that g_n(x) tends
toward infinity. Let G_p^0 denote all elements of G_p leaving all of R^0 fixed.
Under the homomorphism taking G_p to G_p/G_p^0 = H suppose that g_n goes to
ḡ_n; assume that the sequence ḡ_n converges to ḡ and that ḡ is the image of an
element g under this homomorphism.

Let S_1 and S_2 be two spheres each containing g(px) and such that S_2 lies
inside S_1. We may assume that all points g_n(x) are outside S_1. On the arc
g_n(px) there are two points g_n(x_n) and g_n(y_n), where x_n and y_n are points of px,
such that g_n(x_n) is the first point of g_n(px) on S_2 and g_n(y_n) is the first point
of g_n(px) on S_1. The points x_n and y_n lie on px in the order p x_n y_n x.

There is no loss in assuming that the sequences x_n, y_n, g_n(x_n), and g_n(y_n)
converge respectively to x’, y’, x* and y*.

We wish to prove that x’ is identical with x. If it is not, x’ must be a
point of R^0 and the points x_n may also be taken to be in R^0. Hence g_n(x_n)
= ḡ_n(x_n) converges to ḡ(x’) = g(x’) which is impossible because g(x’) is inside
S_2. Hence x’ and x are identical.

This shows that the points x_n converge to x and of course the points y_n
must also converge to x. There are, consequently, elements of G which take
the pair (x_n, y_n) which is near (x, x) to the pair [g_n(x_n), g_n(y_n)] near to the
distinct pair (x*, y*). There is then an element of G taking x to the two distinct
points x* and y*. From this contradiction the theorem follows.


COROLLARY. The group G_p^0, idle on all of R^0, contains only the identity
and the group G_p is itself the rotation group of space.


10. THEOREM 6. The group G is transitive on the space E. 


It has already been remarked that G(x) is closed for any x in E. In particu- 
lar G(p) is closed, and in order to prove our theorem it suffices to prove that 
G(p) is open. 

We know in the plane case as well as in the space case that G_p is a proper
subgroup of G and there must be an element g in G and a point q distinct
from p such that g(p) = q. The set G_p(q) is a sphere. Let q’ denote another
point of G_p(q) and let t be a varying element of G_p which takes q continuously
to q’. There is a neighborhood U of q so small that its translation U’ to a
neighborhood of q’ has no point in common with its original position. The
set U’ is the image of U under the terminal element of the parameter t.

There is a neighborhood V of p so small that g(V) is inside U. As q is 


| 

| 

| 

| 

J 

| 

| 

| 


32 DEANE MONTGOMERY AND LEO ZIPPIN [July 


swept continuously to g’ it must come in contact with the g-image of every 
sphere about # and within V. 

Now let $s$ be any point of $V$ and let $S = G_p(s)$ be its sphere orbit under $G_p$.
There is a $t$ such that $t(q)$ is in $g(S)$, that is,

$$t\{g(p)\} = g g'(s)$$

for some $g'$ in $G_p$. Then

$$g'^{-1} g^{-1} t g(p) = s.$$


In other words p may be carried to any element of V by some element in G. 
Therefore p is an interior point of G(p) and every point of G(p) must be an 
interior point. 

11. The fact that $G$ is transitive on $E$ tells us the nature of $G_x$ when $x$ is
distinct from $p$. Let $x = g(p)$. Then $G_x = g G_p g^{-1}$ and $G_x$ is also the rotation
group and in a proper system of coordinates “centered” on the point $x$ it acts
as the rotation group ordinarily does.

12. Before such geometric concepts as lines and planes can be studied, we 
need to analyze the nature of G as a topological transformation group. It is in 
this that we encounter the principal difficulties implicit in our use of the 
“two-point” form rather than Hilbert’s “three-point” axiom. In much of this 
work we shall continue to treat the plane and space cases together. 


LEMMA 6. Let $x_n$ be a sequence of points converging to $x$ and let $g_n$ in $G$ be
such that $g_n(x_n)$ approaches $x$. Then for any $z$ in $E$, the set $g_n(z)$ is bounded.


Let $O$ be the interior of an orbit $S$ of $G_x$ which is so large that it surrounds
$z$, all of the $x_n$'s, and all of the points $g_n(x_n)$. Let $O^*$ be the interior of a larger
orbit $S^*$ so that $S$ is interior to $O^*$. Let $x_n z$ be arcs of $O$, and suppose now that
for infinitely many of the elements $g_n$ it is true that $g_n(z)$ is outside or on $S^*$.
For these $n$'s, and we take it now that all $n$'s are such, the arcs $g_n(x_n z)$ have
one point in $O^*$ and one point not in $O^*$. They therefore have a first point $z_n'$
on $S^*$. There is a point $x_n'$ on $x_n z$ such that $z_n' = g_n(x_n')$. We may suppose that
$x_n'$ converges to a point $x'$ which is in $O$ or $S$, while $z_n'$ converges to a point $z'$
on $S^*$. There must be an element $g$ in $G$ which leaves $x$ fixed and carries $x'$ to
$z'$. This is by 2.3 because $g_n$ takes the pair $(x_n, x_n')$ to the pair $[g_n(x_n), z_n']$.
However such an element $g$ is in $G_x$ and hence leaves $S$ and its interior invariant
so that a contradiction has been reached. Hence $g_n(z)$ is compact because
almost all its elements are inside $S^*$.

The orbit S* was subject only to the requirement that it surround S so 
that the following corollary is true. 


COROLLARY. Let $x_n$ be a sequence of points converging to $x$ and let $g_n$ be such
that $g_n(x_n) \to x$. Let $z$ be any point of $E$. Then any limit point of the set $g_n(z)$ is
inside or on any orbit of $G_x$ which includes $z$ and every $x_n$.




Our task now is the proof of the following theorem. 


THEOREM 6.1. Let $x_n$ approach $x$ and let $g_n'$ be elements of $G$ such that $g_n'(x_n)$
approaches $y$. Then there is a subsequence $g_{n_i}'$ of the $g_n'$ and an element $g^*$ of $G$
such that $g_{n_i}'$ approaches $g^*$ (in the sense of pointwise convergence).


The proof of this theorem is rather long and is based on a number of 
preliminary lemmas to which we now turn. Lemma 3.1 will also be useful 
here. As usual we use the letter g for an element of G and we use the letter h 
for a homeomorphism of E which is not known to be an element of G. Con- 
vergence of homeomorphisms, as the statement of the theorem implies, is al- 
ways taken in this section in the sense of pointwise convergence, that is, 
$h_n$ converges to $h$ provided that, for each $x$, $h_n(x)$ converges to $h(x)$.


LEMMA 6.12. Let $g_n$ be a sequence of elements of $G$ converging to a homeomorphism
$h$ of $E$. Let $F$ be an arbitrary compact subset of $E$ and let a positive
number $e$ be given. Then there exists an integer $N$ such that if $y$ and $z$ are any
two points of $F$ for which $d(y, z) < 1/N$, and $n > N$, then

$$d[g_n(y), h(z)] < e.$$


The proof of this lemma which is quite similar to various preceding proofs 
will be omitted. 

In the hypotheses of the following lemmas it will frequently occur, as it
did in the preceding lemma, that there is a sequence $g_n$ of elements of $G$ converging
to a homeomorphism $h$ of $E$. From now on we shall express this fact
in abbreviated form by writing $g_n \to h$.


LEMMA 6.13. Let $g_n \to h$ and let $i$ be a positive integer. Then it is true that
$g_n^i \to h^i$.


The proof will be made by induction. It is true when $i = 1$ and we now
assume that it is true for $i - 1$.

Let $x$ be any point of $E$ and let $F$ be the set made up of the points $x$,
$h^{i-1}(x)$, and $g_n^{i-1}(x)$, $(n = 1, 2, \dots)$. By the hypothesis of the induction $F$ is
compact. Let $e$ be any positive number. By the preceding lemma there is an
integer $N$ such that if $y$ and $z$ are in $F$ and $d(y, z) < 1/N$ then for $n > N$

$$d[g_n(y), h(z)] < e.$$

On the other hand for sufficiently large $n$, say $n$ greater than $N'$,

$$d[g_n^{i-1}(x), h^{i-1}(x)] < 1/N.$$

Hence if $n$ is larger than $(N, N')$ we obtain (letting $y$ be $g_n^{i-1}(x)$ and $z$ be
$h^{i-1}(x)$)

$$d[g_n^i(x), h^i(x)] < e.$$




LEMMA 6.14. Let $g_n \to h$ and let $g$ be an arbitrary element of $G$. Then $g g_n \to g h$.


For an arbitrary $x$ we know that $g_n(x) \to h(x)$. It is then an immediate consequence
of the definition of homeomorphism that $g g_n(x) \to g h(x)$.


LEMMA 6.141. Let $g_n \to h$ and assume that there are elements $g$ and $g'$ of $G$
such that $g g_n \to g'$. Then $g' = gh$ and $h$ is in $G$.


By the preceding lemma $g g_n \to gh$ and hence $g' = gh$.


LEMMA 6.15. Let $K$ be a simple closed curve and let $T$ be a nonidentical
sense preserving homeomorphism with a fixed point. Then there exists a pair of
points $x$ and $y$ of $K$ such that $T^i(x) = x$ and $T^i(y) \to x$ as $i \to \infty$.


Choose any moving point $y$. Then $T^i(y)$ must approach monotonically a
point $x$ which is fixed.
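
The monotone approach in this proof can be watched numerically. The following is only a sketch under stated assumptions: the circle is parametrized by an angle in $[0, 2\pi)$, and the map `lift` is a hypothetical example chosen by us (it is not taken from the paper) of a sense preserving homeomorphism whose only fixed point is the angle $0$.

```python
import math

TWO_PI = 2.0 * math.pi


def lift(theta: float) -> float:
    """Lift of a hypothetical circle map T (an assumption made for illustration).

    The derivative 1 + 0.5*sin(theta) is positive, so T preserves sense, and
    the only fixed point on the circle is theta = 0 (identified with 2*pi).
    """
    return theta + 0.5 * (1.0 - math.cos(theta))


def iterate(theta: float, steps: int) -> list[float]:
    """Return the angles theta, T(theta), ..., T^steps(theta)."""
    orbit = [theta]
    for _ in range(steps):
        theta = lift(theta)
        orbit.append(theta)
    return orbit


def circle_distance_to_fixed_point(theta: float) -> float:
    """Arc-length distance on the circle from theta to the fixed point 0."""
    return min(theta, TWO_PI - theta)


# A moving point y = 1.0 is pushed monotonically around the circle toward
# the fixed point x = 0 (reached in the limit as the angle tends to 2*pi).
orbit = iterate(1.0, 2000)
is_monotone = all(b > a for a, b in zip(orbit, orbit[1:]))
final_gap = circle_distance_to_fixed_point(orbit[-1])
```

Here `is_monotone` records that the iterates never turn back, and `final_gap` measures how close $T^i(y)$ has come to the fixed point; the convergence for this particular map is slow, of order $1/i$.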


LEMMA 6.151. Let $g_n \to h$ and suppose that, for a definite point $q$ of $E$,
$h(q) = q$. Then $h$ must leave invariant every orbit of $G_q$, that is, for all $x$,

$$h G_q(x) = G_q(x).$$


Let $x$ be any point. Since $g_n \to h$ we know that $g_n(x) \to h(x)$ and $g_n(q) \to h(q) = q$.
Hence there is an element of $G$ taking $q$ to $q$ and $x$ to $h(x)$. This element
is in $G_q$ and we have thus shown that $h G_q(x)$ is in $G_q(x)$. The equality must
hold because of the nature of $G_q(x)$ in the two cases.


LEMMA 6.152. Let $x_n$ be a sequence of points converging to a point $x$ of $E$.
Let $g_n$ of $G$ be such that $g_n(x_n) \to x$. Then there exists a homeomorphism $h$ of $E$
such that $g_{n_i} \to h$ for a subsequence $g_{n_i}$ of $g_n$.


The proof here is similar to that of Lemma 4, but it depends also on Lemma 6.


LEMMA 6.16. Let $x_n \to x$ and let $g_n'$ be such that $g_n'(x_n) = y_n \to y$. Let $g$ (in $G$)
be such that $g(y) = x$. Then $g_n = g g_n'$ has a subsequence $g_n''$ which converges to a
homeomorphism $h$ and $h(x) = x$.


Since $y_n \to y$, $g(y_n) \to g(y) = x$. Then $g_n(x_n) = g g_n'(x_n) = g(y_n) \to x$. By the preceding
lemma there is a subsequence $g_n''$ and a homeomorphism $h$ such that
$g_n'' \to h$. By Lemma 3.1, $g_n''(x_n) \to h(x)$ and hence $h(x) = x$.

We are now ready to prove Theorem 6.1 for the plane and we restate it 
here for this case. 


THEOREM 6.1'. Let $x_n \to x$ and let $g_n'$ be elements of $G$ such that $g_n'(x_n) \to y$.
Then there is a subsequence $g_{n_i}'$ of the $g_n'$ and an element $g^*$ of $G$ such that
$g_{n_i}' \to g^*$. (For $E_2$.)


Let $g_n''$ and $h$ be as in Lemma 6.16. The elements $g_n''$ are sense preserving
and consequently $h$ is sense preserving. For all $z$ distinct from $x$, $G_x(z)$ is a
simple closed curve. Since $h G_x(z)$ is in $G_x(z)$ and since $h$ is a homeomorphism,
$h G_x(z) = G_x(z)$ and $h$ is sense preserving on $G_x(z)$ (see Lemma 6.151).

For convenience let $K = G_x(z)$ for an arbitrary point $z$ distinct from $x$, and
let $z' = h(z)$. Since $G_x$ is transitive on $K$, there is a $g'$ in $G_x$ such that $g'(z') = z$,
that is, $g'h(z) = z$. By a previous lemma $g' g_n'' \to g'h$. Since $g'$ is in $G_x$,

$$g'h(x) = g'(x) = x.$$


It follows that $h' = g'h$ preserves the orbits of $G_x$ and in particular it follows
that $h'$ is a sense preserving homeomorphism of $K$. Moreover $h'(z) = z$.

We wish to prove that $h'$ is the identity on $E$ and we begin by proving
that $h'$ is the identity on $K$.

If $h'$ is not the identity, then there is a pair of distinct points $x^*$ and $y^*$
such that $h'^i(x^*) = x^*$ and $h'^i(y^*)$ approaches $x^*$ as $i$ approaches infinity. For
a fixed $i$ we know that $(g_n^*)^i \to h'^i$ where $g_n^* = g' g_n'' \to g'h = h'$. In view of these
two facts we see that $(g_n^*)^i$ for some $n$ and $i$ takes $x^*$ and $y^*$ into a specified
neighborhood of $x^*$, which violates the two-point condition. Hence $h'$ is the
identity on $K$.

We shall show next that the set of points of $E - x$ on which $h'$ is the identity
contains only inner points. Let $z'$ be any point in $E - x$ which is fixed
under $h'$. Since $h'(z') = z'$, $h'$ must leave invariant the orbits $G_{z'}(z^*)$ for any $z^*$
in $E$. Let $K' = G_x(z')$. By the argument above we see that $h'$ leaves every
point of $K'$ fixed. Let $U$ be a neighborhood of $z'$ so small that points of $K'$ lie
outside of $U$ and let $V$ be a neighborhood of $z'$ which is invariant under $G_{z'}$
and is contained in $U$. For any point $z^*$ of $V$ the simple closed curve
$K^* = G_{z'}(z^*)$ must contain a point of $K'$. This is a fixed point of $h'$ and the
argument of the preceding paragraph shows that every point of $K^*$ is fixed
under $h'$. Then it follows that all of $V$ is fixed under $h'$.

Hence the set of points fixed under $h'$ is both open and closed in $E - x$.
This set of fixed points therefore includes $E - x$ and since it is closed it must
include $E$. Hence $h'$ is the identity.

Now $h' = g'h$ so that $h = g'^{-1}$ and $h$ is an element of $G_x$. This completes
the proof of the theorem for the case of the plane. To see this assume for
convenience that $g_n''$ and $g g_n'$ coincide. Then $g g_n' \to h$ and $g_n' \to g^{-1}h$.

We turn next to the proof of the theorem for $E_3$.


THEOREM 6.1''. Let $x_n \to x$ and let $g_n'$ be elements of $G$ such that $g_n'(x_n) \to y$.
Then there is a subsequence $g_{n_i}'$ of $g_n'$ and an element $g^*$ of $G$ such that $g_{n_i}' \to g^*$.
(For $E_3$.)


Let $g_n''$ and $h$ be as in the preceding proof, that is, as in Lemma 6.16.
The transformations $g_n''$, and therefore $h$ also, are sense preserving in $E$. The
point $x$ is fixed under $h$ and $h$ preserves orbits $G_x(z)$, these orbits being two-spheres.
Let $z$ be a definite point of $E$ distinct from $x$ and let $S = G_x(z)$. Since
$h$, being sense preserving, has at least one fixed point on $S$ we may assume
that $z$ is such a fixed point. Then $h$ will preserve orbits $G_z(y)$ for all $y$ of $E$.

We are going to look for some simple closed curve $K$, on $S$, invariant
under $G_x$ and $G_z$ and also invariant under $h$. Let $S'$ denote a sphere orbit
under $G_z$ such that some points of $S$ are outside $S'$. Let $F$ denote the closed
intersection of $S$ and $S'$ and let $D$ denote the component of $S - F$ which contains
$z$.

Now $S$ and $S'$ are invariant under $h$ and their intersection $F$ must also
be invariant under $h$. Hence $S - F$ is invariant and $D$ must be invariant under
$h$ since it is a component with a fixed point. Let $K$ be the boundary of $D$
on $S$. It is this set which will be shown to be a simple closed curve. Observe
at the moment that $K$ is invariant under $h$ and is a subset of $F$.

The group $G_{xz}$ leaving $x$ and $z$ fixed is a circle group and $F$ is invariant
under $G_{xz}$. Let $y$ denote any point of $K$ above. Then $K = G_{xz}(y)$. Hence $K$ is a
simple closed curve invariant under $h$. Furthermore $h$ in its action on $K$ must
be sense preserving for otherwise it would have to interchange the two components
of $S - K$, which it cannot do since we know that $D$ contains a fixed
point.

Since $G_{xz}$ is transitive on $K$ there is an element $g'h = h'$, with $g'$ in $G_{xz}$, which has all the
properties of $h$ which we need to use and in addition has a fixed point on $K$.

We wish to show that h’ is the identity on S ultimately enlarging the set 
of fixed points to include all of E. By a familiar argument h’ must leave all 
of K fixed. The following lemma will be useful to us as we proceed. 


LEMMA 6.17. If $h'$ leaves fixed a point $p$ and a continuum on an orbit
$S^* = G_p(q)$, then $h'$ leaves all of $S^*$ fixed.


Let $F$ be the set of fixed points of $h'$ on $S^*$, and let $M = S^* - F$. Let $O$
be a component of $M$, and let $B$ be the boundary of $O$. Now from the hypothesis
$B$ cannot be zero dimensional, for if it were $B$ would be all of $F$ and
so $F$ would not contain a continuum. Hence $B$ is one dimensional, and it must
contain a continuum $C$ which contains a point $b$ which is accessible from $O$.
Let $U$ be a neighborhood of $b$ (in $S^*$) so small that any simple closed curve in
$U$ surrounding $b$ must meet $C$. Let $G_b(y)$ be an orbit so small that its intersection
with $S^*$ is in $U$. Then by an argument given above there is a simple
closed curve in the intersection of $G_b(y)$ and $S^*$ which is invariant under $h'$.
Furthermore $h'$ preserves sense on this curve. Since this curve surrounds $b$ it
must meet $C$ and consequently $h'$ must leave all the curve fixed. As this curve
was arbitrarily small we see that $b$ cannot be accessible from $O$. This is a
contradiction which proves the lemma.

We can of course conclude now that $h'$ leaves all of $S$ fixed. Furthermore
any small “sphere” with center on $S$ will also be left fixed by $h'$ since it will
intersect $S$ in a one dimensional set. The set of “spheres” about $x$ forms a ray
(the end point of the ray being of course a point orbit). The above considerations
show that the “spheres” of this ray left entirely fixed by $h'$ form a set
which is both open and closed. Therefore $h'$ leaves every point of $E$ fixed and
is the identity.

Now as in the plane case $h' = g'h$ and $h = g'^{-1}$ so that $h$ is in $G$ and indeed
in $G_{xz}$. This completes the proof as in the former case.


12.1. LEMMA 6.2. If $g_n \to g$, then $g_n^{-1} \to g^{-1}$.


Let $x$ be any point of $E$. By hypothesis $g_n(x) \to g(x)$. Also $g_n g^{-1}(x) \to g g^{-1}(x) = x$.
If we knew that $g g_n^{-1}(x) \to x$ we could conclude that $g_n^{-1}(x) \to g^{-1}(x)$. The
proof of the lemma is therefore reduced to the proof of the following special
case. If $g_n$ is in $G$ and $g_n(x) \to x$, then $g_n^{-1}(x) \to x$.

In order to prove this proposition let $S$ be any sphere about $x$. We shall
show that almost all the points $g_n^{-1}(x)$ are inside $S$. If this is not true we arrive
at a contradiction as follows. For each $g_n^{-1}(x)$ not inside $S$ choose a short arc
joining $x$ to $g_n(x)$, call it $x g_n(x)$. Then $g_n^{-1}(x g_n(x))$ will be an arc joining $g_n^{-1}(x)$
to $x$. It will therefore contain a point $y_n$ on $S$. Now $g_n$ takes the pair $(y_n, x)$ to
a pair $(g_n(y_n), g_n(x))$, both elements of this pair being near $x$. The points $y_n$
will have a limit $y$ on $S$ and there will have to be a $g$ in $G$ taking both $y$ and $x$
to $x$. This contradiction establishes the lemma.


LEMMA 6.3. If $g_n \to g$ and $g_n' \to g'$, then $g_n g_n' \to g g'$.


Let $x$ be any point of $E$. Then $g_n'(x) \to g'(x)$. Hence $g_n[g_n'(x)] \to g g'(x)$,
which we wanted to prove.


THEOREM 6.2. Let $(a_1, \dots, a_k)$ and $(A_1, \dots, A_k)$ be any two sets of $k$
points. Let $(a_1^n, \dots, a_k^n)$ approach $(a_1, \dots, a_k)$ and let $(A_1^n, \dots, A_k^n)$ approach
$(A_1, \dots, A_k)$. If there is for each $n$ an element $g_n$ in $G$ such that

$$g_n(a_i^n) = A_i^n, \qquad i = 1, \dots, k,$$

then there exists an element $g$ in $G$ such that

$$g(a_i) = A_i, \qquad i = 1, \dots, k.$$


Let $g_n'$ be an element of $G$ such that $g_n'(a_1) = a_1^n$. Then $g_n g_n'(a_1) = A_1^n$.
Since $A_1^n \to A_1$ we see that $g_n g_n'(a_1) \to A_1$. Since $a_1$ certainly approaches $a_1$, Theorem
6.1 tells us that the sequence $g_n g_n'$ contains a subsequence converging to
an element $g^*$ in $G$. We assume that our original sequence above is taken as
this convergent subsequence. An application of Theorem 6.1 to the sequence
$g_n'$ (remembering that $g_n'(a_1) = a_1^n \to a_1$) shows that for a subsequence of $g_n'$
(which we now take for the whole sequence) $g_n' \to g'$ where $g'$ is in $G$. We have
now arrived at the following situation:

$$g_n g_n' \to g^*, \qquad g_n' \to g'.$$

By the preceding results $g_n'^{-1} \to g'^{-1}$. Hence

$$g_n'^{-1}(a_i^n) \to g'^{-1}(a_i), \qquad i = 1, \dots, k.$$

Now for any $i$ $(1 \le i \le k)$

$$A_i^n = g_n(a_i^n) = g_n g_n'[g_n'^{-1}(a_i^n)] \to g^* g'^{-1}(a_i).$$

But $A_i^n \to A_i$ and hence

$$g^* g'^{-1}(a_i) = A_i.$$

The element $g^* g'^{-1}$ therefore has the desired properties.

This theorem applied to the plane case (and with k=3) yields Hilbert’s 
Axiom III. It is worth pointing out that we have actually proved a great deal 
more than Hilbert’s axioms; we have also obtained many of the results of 
his paper. But as we are, at the moment, only interested in establishing that 
our weaker axioms suffice for the plane, we leave the plane case and turn our 
entire attention to the three dimensional case. From now on it is to be under- 
stood that we are dealing with this latter case. 

13. By a $G$-straight, or more simply a straight or a line, we shall mean a
topological line which is the set of points left fixed by a circular subgroup of
some $G_x$. Through every point of space there is clearly a large family of
straights. Now let $x$ and $y$ be any two distinct points of the space. In the group
$G_x$ there is one and only one circular subgroup leaving $y$ fixed. Let this group
be called $K_{xy}$. It clearly leaves fixed a topological line, call it $L_{xy}$. We have
therefore seen that there is at least one $G$-straight through every two distinct
points of space. We know that $L_{xy}$ is the set of all fixed points of $K_{xy}$. We have,
therefore, the following theorem.


THEOREM 7. Through each pair of distinct points of E there passes one and 
only one G-straight. 


13.1. It is worth noting that if an element $g$ in $G$ leaves fixed three points
$x$, $y$, $z$ not all on the same straight then $g$ leaves all of $E$ fixed. For since $g$
leaves $x$ fixed it is in $G_x$. Since it leaves $y$ fixed it is in the circular subgroup
$K_{xy}$ of $G_x$; and since it leaves fixed $z$, a point not on the “axis” of $K_{xy}$, it must
leave all of $E$ fixed.

The symbol $xy$ will be used to denote the closed portion of the line $L_{xy}$
which is contained between the points $x$ and $y$. This set of points will be called
an interval or the interval $xy$, or a segment.

Let $L_{xy}$ be a straight, left fixed by the circular group $K_{xy}$, and let $g$ be
any element of $G$. Then the set of points $g(L_{xy})$ is the set left fixed by $g K_{xy} g^{-1}$.
In other words the image of a straight under any element of $G$ is also a straight.
It follows that the image of a segment is a segment.

Two configurations of the space E are said to be congruent if one of them 
is carried into the other by some element of G. 

Any two straights are congruent and in fact any two marked straights are
congruent. By a marked straight we mean a straight with some one of its
points particularly “marked.” It is furthermore true that any marked straight 
can be taken to any other so that a given direction on the one goes to a given 
direction on the other. These facts follow from the transitivity of G and the 
nature of the rotation group. 

14. As we have said, a sphere is defined to be any two dimensional orbit
of any group $G_x$. The point $x$ is called the center of the sphere.


THEOREM 8. A straight and a sphere can intersect in two points at most. 


Let $x$ be the center of the sphere $S$ and let $L$ be the straight. Suppose $a$
and $b$ are two points of intersection of $L$ and $S$. There exists an element of $G_x$
which interchanges $a$ and $b$. This element must carry $L$ into itself since $L$ is
determined by any two of its points. Then there exists a non-trivial subgroup
of $G_x$ which leaves $L$ invariant. This subgroup, which will be denoted by $Q$,
is compact. Such a compact group acting on a line can contain only two distinct
transformations, the identity and a reflection. Hence under this group
the orbit of $a$ consists of the two points $a$ and $b$. It follows that there can be
no other point $c$ of $L$ on $S$, for otherwise there would be an element of $Q$ interchanging
$a$ and $c$ and $a$ would have at least three points in its orbit. This completes
the proof.
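
For orientation, the statement can be compared with the familiar Euclidean model, in which it reduces to counting the roots of a quadratic. The sketch below assumes ordinary coordinates in three-space and a nonzero direction vector; the function name `line_sphere_intersections` is ours and the computation illustrates the Euclidean case only, not the group-theoretic proof above.

```python
import math

Vec = tuple[float, float, float]


def dot(a: Vec, b: Vec) -> float:
    """Euclidean inner product."""
    return sum(ai * bi for ai, bi in zip(a, b))


def line_sphere_intersections(point: Vec, direction: Vec,
                              center: Vec, radius: float) -> list[Vec]:
    """Points where the line {point + t*direction} meets |w - center| = radius.

    Assumes direction is not the zero vector. The parameter t satisfies a
    quadratic equation, so at most two points can be returned.
    """
    rel = tuple(p - c for p, c in zip(point, center))
    a = dot(direction, direction)
    b = 2.0 * dot(rel, direction)
    c = dot(rel, rel) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return []
    root = math.sqrt(disc)
    # A set merges the two roots when the line is tangent (disc == 0).
    params = {(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)}
    return [tuple(p + t * d for p, d in zip(point, direction))
            for t in sorted(params)]


origin = (0.0, 0.0, 0.0)
secant = line_sphere_intersections((-2.0, 0.0, 0.0), (1.0, 0.0, 0.0), origin, 1.0)
tangent = line_sphere_intersections((0.0, 0.0, 1.0), (1.0, 0.0, 0.0), origin, 1.0)
missing = line_sphere_intersections((0.0, 0.0, 2.0), (1.0, 0.0, 0.0), origin, 1.0)
```

The three sample lines give two, one, and no points of intersection respectively; no choice of line can give three.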


THEOREM 9. If p and q are inside a sphere S, then the segment pq is inside S. 


The straight $L_{pq}$ is not compact in either direction. In going along $L_{pq}$
from $p$ to $q$ and on we must meet $S$ in some point. Similarly we must meet $S$
in going from $q$ to $p$ and on out. The straight $L_{pq}$ meets $S$, then, in two points
neither of which is in the interval $pq$. Therefore no point of the interval can
be outside $S$ for this would imply that some point of the interval was on $S$
and this would mean that $L_{pq}$ had at least three distinct points on $S$.


15. THEOREM 10. If $x_n \to x$ and $y_n \to y$ then $x_n y_n \to xy$.


Let $z_n$, $n = 1, 2, 3, \dots$, be a point of $x_n y_n$. Any sphere surrounding $xy$
surrounds almost all the $z_n$, so that for a proper subsequence of the $n$ and a
suitable point $z$, $z_n \to z$. We have to prove that $z$ is on $xy$. There exist elements
$g_n$ such that $g_n x_n = x$, and $g_n y_n$ is on $L_{xy}$ on the same side of $x$ as the point $y$.

Now we know that there is a subsequence of the $g_n$ and an element $g$ such
that $g_n \to g$. Further $gx = x$, $g_n y_n \to gy$, and $g_n z_n \to gz$. Now $g_n y_n$ is on $L_{xy}$ so that
$gy$ must also lie on $L_{xy}$. On the other hand $gy$ belongs to $G_x(y)$. Then it is clear
that $gy = y$. Therefore $g$ belongs to $K_{xy}$, $g^{-1}$ belongs to $K_{xy}$ and $z = g^{-1}(gz)$ is
a point of $L_{xy}$ since $gz$ is a point of $L_{xy}$. It is now a trivial matter that $z$ is on
$xy$ and that every point of $xy$ is a limit point of some sequence $z_n$.

It should be remarked that under the same hypothesis the line $L_{x_n y_n}$ converges
to the line $L_{xy}$. By this we mean that every sequence $z_n$ of points from
$L_{x_n y_n}$ either has no limit point or every limit point which it has is on $L_{xy}$.
Furthermore every point of $L_{xy}$ is a limit point of such a sequence.

Let us now take a point $z_n$, $n = 1, 2, 3, \dots$, on $L_{x_n y_n}$ and assume that the
sequence $z_n$ converges to a point $z$. The intervals $x_n z_n$ then converge to $xz$
and $y_n z_n$ converge to $yz$. It may be assumed that the points $z_n$ are outside the
interval $x_n y_n$, say in the order $x_n\,y_n\,z_n$. The segment $xz$ then contains $xy$ and $yz$,
so that $xz$ is clearly part of the line $L_{xy}$.

The argument that every point of $L_{xy}$ is such a limit is not difficult.


16. THEOREM 11. Let x and y be any two points of the sphere S. Then the 
interval xy, except for x and y, is inside S. 


Let x, and y, be sequences of points inside S converging respectively to x 
and y. By a preceding theorem the intervals x,y, must be contained in the 
interior of S. The limit of these intervals will then be inside or on S. This 
limit is xy. This shows that xy is entirely contained in S and its interior. But 
$L_{xy}$ can have no point besides $x$ and $y$ on $S$. The conclusion therefore follows.

17. A point y is said to be the midpoint of the segment or interval xz if it 
is on xz and if there is an element of G leaving y fixed and interchanging x 
and z. The point y is then the center of a sphere containing x and z as anti- 
podal points. 


THEOREM 12. Every segment has a unique midpoint. 


We begin by showing the existence of the midpoint. Let $y$ be a variable
point of the segment $xz$. There is in $G_y$ an element of order two which moves $x$
to a unique point of $L_{xz}$. Let this unique point be denoted by $f(y)$. Now
$f(x) = x$; and $f(z)$ lies on $L_{xz}$ and has the order $x\,z\,f(z)$. Therefore if $f(y)$ is continuous
it will assume for some $y$ the value $z$. We have only to show therefore
that $f(y)$ is continuous.

Let $y_n$ approach $y_0$. Lemma 6 shows that $f(y_n)$ is a bounded set, and we
may assume that $f(y_n)$ approaches a point $w$ of the line $L_{xz}$. We wish to show
that $w = f(y_0)$. Assume that this is not true and that $w$ is distinct from $f(y_0)$.
The pair $(x, y_n)$ is carried by an element of the group to the pair $[f(y_n), y_n]$.
There must be an element $g$ of the group taking the pair $(x, y_0)$ to the pair
$(w, y_0)$. There is also an element $g'$ taking the pair $(x, y_0)$ to the pair $(f(y_0), y_0)$.
That is

$$g(y_0) = g'(y_0) = y_0$$

and

$$g(x) = w, \qquad g'(x) = f(y_0).$$

Hence $g'g^{-1}$ leaves $y_0$ fixed and takes $w$ to $f(y_0)$. This is impossible because
$g'g^{-1}$ is in the compact group $G_{y_0}$ and the points $w$ and $f(y_0)$ are both on the
same side of $y_0$ on the line $L_{xz}$. Hence $f(y)$ is continuous and the midpoint of $xz$
exists.

Assume now that there are two midpoints $y$ and $y'$ of the segment $xz$.
Each of them gives rise to a reflection interchanging $x$ and $z$ and leaving itself
fixed. The product of these two reflections of $L_{xz}$ is a transformation of $L_{xz}$
leaving $x$ and $z$ fixed and moving other points on the line. This is impossible.
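
The existence half of the proof is an intermediate value argument, and it can be imitated numerically. The sketch below assumes the Euclidean model of a straight, namely the real number line, on which the reflection in a point $y$ sends $x$ to $2y - x$; the names `reflect` and `midpoint_by_bisection` are ours. It locates the $y$ with $f(y) = z$ by bisection, using only that $f$ is continuous and increasing with $f(x) = x$ and $f(z)$ beyond $z$.

```python
def reflect(x: float, y: float) -> float:
    """Image f(y) of x under the reflection of the line in the point y
    (Euclidean model: the line is the real axis)."""
    return 2.0 * y - x


def midpoint_by_bisection(x: float, z: float, tol: float = 1e-12) -> float:
    """Find y in [x, z] with f(y) = z, assuming x < z.

    Since f(x) = x lies before z and f(z) lies beyond z, the continuous
    function f takes the value z somewhere in between.
    """
    lo, hi = x, z
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if reflect(x, mid) < z:
            lo = mid   # f(mid) falls short of z: the midpoint lies further on
        else:
            hi = mid   # f(mid) reaches or passes z: the midpoint lies before
    return 0.5 * (lo + hi)


y = midpoint_by_bisection(1.0, 5.0)
```

For the segment from $1$ to $5$ the search closes in on $y = 3$, the point whose reflection interchanges the two ends.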

18. The next geometric concept to be defined will be the projection of a
point $z$ on a line $L$. If $z$ is on $L$ this projection is defined to be $z$ itself. If $z$
is not on $L$ let $g$ be an element of order two in $K$, which is the circular group
leaving every point of $L$ fixed. Let $z' = g(z)$; evidently $g(z') = z$. The line $L_{zz'}$
and the segment $zz'$ are invariant under $g$, so that $g$ is a reflection of the line
and segment with a fixed point $p$. Since this point $p$ is fixed under an element
of $K$ not the identity, it is fixed under all of $K$ and is on $L$. The point $p$ is
now defined to be the projection of $z$ on $L$.
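
In the Euclidean model this construction can be carried out explicitly. The sketch below assumes, for simplicity, that $L$ passes through the origin with unit direction `u`; the half-rotation about $L$ is then $z \mapsto 2(u \cdot z)u - z$, and the projection is the fixed point of that map on the segment $zz'$, that is, the midpoint of $z$ and $z'$. The helper names are ours and the coordinates are an assumption of the illustration.

```python
Vec = tuple[float, float, float]


def dot(a: Vec, b: Vec) -> float:
    """Euclidean inner product."""
    return sum(ai * bi for ai, bi in zip(a, b))


def half_rotation(u: Vec, z: Vec) -> Vec:
    """Element g of order two about the line through the origin with unit
    direction u: every point of that line is fixed, and g(g(z)) = z."""
    s = 2.0 * dot(u, z)
    return tuple(s * ui - zi for ui, zi in zip(u, z))


def projection(u: Vec, z: Vec) -> Vec:
    """Projection of z on the line: the midpoint of z and z' = g(z),
    which is the point of the segment zz' left fixed by g."""
    z_image = half_rotation(u, z)
    return tuple(0.5 * (a + b) for a, b in zip(z, z_image))


u = (0.0, 0.0, 1.0)          # the line L is the third coordinate axis
z = (3.0, 4.0, 5.0)
z_prime = half_rotation(u, z)
p = projection(u, z)
```

With these sample values $z' = (-3, -4, 5)$ and the projection is $p = (0, 0, 5)$, a point of $L$ that the half-rotation leaves fixed.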


THEOREM 13. The projection of z on L is a continuous function of z. 


Suppose $z_n$ converges to $z$. Then $g(z_n) = z_n'$ converges to a point $z'$, $g$ being
the element of order two in the group leaving $L$ fixed. The segments $z_n z_n'$
converge to the segment $zz'$. The points $p_n$ have some limit point on $zz'$.
But a limit point of the $p_n$'s must be fixed under $g$ and can only be the point $p$.

19. The space E is given to us as a metric space. Our purpose now is to 
introduce a new metric equivalent to the old which will be invariant under G. 

Let L be any straight in E and let G* be the set of elements of G which 
transform L into itself while preserving direction. Any element of G* which 
leaves a point of $L$ fixed leaves every point of $L$ fixed. Let $g_n$ be a sequence of
elements of $G^*$ such that for some point $a$ in $L$ the sequence $g_n(a)$ approaches
$a$. Then for any $b$ in $L$ the sequence $g_n(b)$ approaches $b$. It follows from the
fact mentioned here that if two homeomorphisms of G* act approximately 
the same way on a single point of space then they act approximately the same 
way over any bounded part of the line. 

It will be useful to note also that if $ab$ is an interval of $L$ and $g$ takes $L$
into itself with direction reversed and if g(b) =a it follows that g(a) =b. With 
the aid of G* we shall now see how L becomes the carrying space of a topologi- 
cal group. 

19.1. Let $o$ be a point of $L$ fixed but arbitrary. Let $a$ and $b$ be any two points
of $L$. Let $f$ be an element of $G^*$ which moves $o$ to $a$, and let $g$ be an element of
$G^*$ which moves $o$ to $b$. Then by definition $a \cdot b = fg(o)$. This definition of multiplication
on $L$ is associative. It has an inverse, and hence all the group axioms
are satisfied. If $f(o) = a$, the inverse of $a$ is $f^{-1}(o)$. Another way to obtain $a^{-1}$
is to define it as the position to which $a$ goes by an element which reverses
direction on $L$ while leaving $o$ fixed. The latter definition shows that the operator
“inverse” is continuous.

Now assume that $a_n$ approaches $a$ and $b_n$ approaches $b$. Let $f_n(o) = a_n$,
$g_n(o) = b_n$, $f(o) = a$, and $g(o) = b$. By the remarks in §19 $f_n$ is near $f$ for large $n$
and any bounded portion of the line. A similar statement may be made about
$g_n$ and $g$. Therefore $f_n g_n(o)$ is near $fg(o)$. This shows that the group multiplication
$a \cdot b$ is simultaneously continuous in $a$ and $b$.

Thus with multiplication as above defined L becomes the carrier of a 
topological group. But it is known that such a group must be bicontinuously 
isomorphic to the additive group of real numbers. The line L may now be 
metrized with the metric of the real numbers carried by it in this way. That 
is, if x and y are any two points of L, then d*(x, y), the new distance, is defined 
to be the absolute value of the difference of the real numbers corresponding 
to x and y. 
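
A small model may make the construction concrete. The sketch below assumes that $L$ is the real number line and that $G^*$ consists of the ordinary translations; the base point `O = 2.0` and all function names are our own choices for illustration. The product $a \cdot b = fg(o)$ then works out to $a + b - o$, the map $a \mapsto a - o$ is the isomorphism with the additive reals, and $d^*$ is the absolute difference of the corresponding real numbers.

```python
from typing import Callable

O = 2.0  # the base point o, fixed but arbitrary (an assumed sample value)


def translation_taking_o_to(a: float) -> Callable[[float], float]:
    """The direction-preserving motion of the line that moves o to a
    (in this model, an ordinary translation)."""
    shift = a - O
    return lambda x: x + shift


def product(a: float, b: float) -> float:
    """The group product a . b = f g (o), where f(o) = a and g(o) = b."""
    f = translation_taking_o_to(a)
    g = translation_taking_o_to(b)
    return f(g(O))


def inverse(a: float) -> float:
    """Position to which a goes under the reflection that fixes o."""
    return 2.0 * O - a


def coordinate(a: float) -> float:
    """Real number corresponding to a under the isomorphism with (R, +)."""
    return a - O


def d_star(x: float, y: float) -> float:
    """New distance: absolute difference of the corresponding real numbers."""
    return abs(coordinate(x) - coordinate(y))


a, b = 5.0, -1.5
t = translation_taking_o_to(7.25)   # an arbitrary element of G* in this model
```

In this model `product(a, O)` returns `a`, `product(a, inverse(a))` returns `O`, and `d_star(t(x), t(y))` agrees with `d_star(x, y)`, which is the invariance established in the paragraphs that follow.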

Now suppose that the segment $xy$ is translated to $x'y'$ by an element $g$
of $G^*$. Thus:

$$g(x) = x', \qquad g(y) = y'.$$

If $g(o) = z$, then

$$z \cdot x = x', \qquad z \cdot y = y'.$$

Therefore $xy$ is translated to $x'y'$ by an operation of the topological group
defined above and therefore $d^*(x, y) = d^*(x', y')$.

Any element of G which reverses sense on L also preserves the new dis- 
tance. This is because there is one sense reversing transformation which 
merely changes the ends of a given interval. This transformation leaves the 
length of the interval invariant. Any other sense reversing transformation is 
the product of this one and an element of G*, from which our statement now 
follows. 

19.2. We may now define the new distance for any two points x and y of 
space. There is some element in G which carries x and y to two points x’ 
and y’ of L. The distance d*(x, y) is defined as equal to d*(x’, y’). By its very 
definition this new distance, extended as it now is to all of $E$, is invariant under
$G$. In fact any two pairs of points are congruent if and only if they have
the same distance d*. 

19.3. To show that the new distance d* is equivalent to the old it is only 
necessary to show that it is a continuous function. 


THEOREM 14. The distance d*(x, y) is a continuous function of x and y. 


Let the $x_n$'s converge to $x$ and the $y_n$'s to $y$. We saw in the proof of Theorem 10
that $G$ contains an element which takes $x_n$ to $x$ and $y_n$ to the line $L_{xy}$, near $y$.
But on $L$ and hence on $L_{xy}$, $d^*$ is a continuous function. This shows that $d^*$ of
the transformed pair, which is the same as $d^*$ for the original pair $(x_n, y_n)$,
is near $d^*(x, y)$.


19.4. THEOREM 15. The distance $d^*(x, y)$ satisfies the triangle axiom.


Let $A$, $B$, and $C$ be any three points of $E$. We wish to prove that

$$d^*(A, C) + d^*(C, B) \ge d^*(A, B).$$

If $C$ is on the line $L_{AB}$ we know this. When $C$ is between $A$ and $B$ the equality
holds, and this will be the only case of equality.

Suppose now that $C$ is not on $L_{AB}$ and let $C'$ be the projection of $C$ on the
line. To prove the desired inequality it will suffice to show that $d^*(A, C)
> d^*(A, C')$ and $d^*(C, B) > d^*(C', B)$. This follows from the following lemma.


LEMMA 7. Let $x$, $y$, and $z$, not on a line, be such that $z$ is the projection of $y$
on $L_{xz}$. Then $d^*(x, y) > d^*(x, z)$.


Let $S$ be the sphere about $x$ which goes through $y$, that is, the sphere with
center $x$ and radius $d^*(x, y)$. Let $g$ be the element of order two in $K_{xz}$ and let
$y' = g(y)$. The segment $yy'$ has $z$ as midpoint, so that certainly $z$ is on $yy'$.
The point $y'$ is on $S$ because $xy$ and $xy'$ are congruent under $g$. If now
$d^*(x, z) \ge d^*(x, y)$, there would have to be a point $z'$ on the segment $xz$ such
that $d^*(x, z') = d^*(x, y)$. Since $y$ and $y'$ are on the sphere, $z$ is inside it. Of
course $x$ is inside the sphere and hence $z'$ is also inside it, contrary to the preceding
equality.

It should be noted that all points of yy’ except y and y’ are inside S so that 
for any point w of yy’ distinct from y and y’, d*(x, w) <d*(x, y). 

It is clear that we have established the metric characterization of a 
straight line: three points are on a straight line if and only if their distances 
in proper and unique order satisfy the triangle equality. From this point on, 
the distance d*, now called d, will be the only distance used. 

20. Let $L$ and $L'$ be two lines having in common the single point $p$ and
such that there exists an element of order two in $K_L$ under which the line $L'$
is invariant. In this case $L'$ is said to be orthogonal or perpendicular to $L$. If $x$
is a point on $L'$, its projection on $L$ is the point $p$.

Consider the rotation group $G_p$ and the coordinate system that goes with
it which makes $G_p$ the three-space rotation group. Since there is an element of
order two, that is, a half-rotation in $G_p$ which leaves $L$ fixed and $L'$ invariant,
there is also a half-rotation in $G_p$ which leaves $L'$ fixed and $L$ invariant. In
other words L is also orthogonal to L’ and the relation of orthogonality is 
symmetric. It can be seen also that orthogonality is a group invariant. 

20.1. The transitivity of $G$ and the nature of the rotation group enable us
to state the following two theorems.


THEOREM 16. Let $L$ and $N$ be two lines both orthogonal to a line $M$ at a
point $p$. There is then an element of $K_M$ which carries $L$ to $N$.


This makes it clear incidentally that the locus of points on lines orthogo- 
nal to L at p is a topological plane. 




THEOREM 17. Let L and L' be orthogonal at p, and let M and M' be orthogo-
nal at q. There is then an element g in G such that g(p)=q, g(L)=M and
g(L')=M'. Furthermore this element may be so chosen as to take any desired
directions on L and L' to any desired directions on M and M'.


20.2. Let L and L' be a pair of orthogonal lines intersecting in a point p.
Let x and y be two points of L' on opposite sides of p and at equal distances
from p. A half-rotation about L must then interchange x and y. If q is any
point of L, the half-rotation carries the segment qx to qy, so that q is equidis-
tant from x and y. We may express this result by saying that if two lines are
orthogonal any point on one of them is equidistant from any pair of sym-
metrically placed points on the other.

20.3. A triangle is defined in the natural way as the system of three seg- 
ments joining pairs of a set of three points. Other simple geometric concepts 
will sometimes be used without definition when the definition is perfectly 
straightforward. 

As a sort of converse to the result of the preceding section the following 
theorem is given at this point. 


THEOREM 18. The altitude of an isosceles triangle bisects the base. 


Let q, x, and y be three points such that qx and qy are congruent. Let p
denote the projection of q on L_{xy}. Then L_{pq} is orthogonal to L_{xy}. Consider
the half rotation which leaves L_{pq} fixed and L_{xy} invariant. This carries x
and y to x' and y', all four points being on the same line. All four segments
qx, qy, qx', and qy' are equal and the line L_{xy} must meet the sphere G_q(x)
in four points, x, y, x', and y'. At most two of these are distinct and we know
that x and y are distinct. This makes it clear that y' is x and x' is y, so that p
is the midpoint of the segment xy.

21. Let L denote a line and p a point on it. Let π denote the set of all
points of space whose projection on L is the point p. Such a set, by definition,
is a plane of our geometry. It is clear that any line M orthogonal to L at the
point p belongs to π, and that every point of π is on one such line. Now let
x and x' be a pair of points symmetrically situated about p on the line L.
Then all points of π are equidistant from x and x': conversely any point
equidistant from these must lie on π. Our planes may therefore be character-
ized as the locus of points equidistant from some pair of points. If we consider
the circle group which leaves fixed the line L = L_{xx'}, we see that every line M,
as above, is generated by this group from any arbitrary one. It follows at
once that our planes have the topological structure which they should.

21.1. To show that our planes are linear sets we shall borrow from
Kerékjártó the notion of introducing a simple antipodal transformation α,
not an element of G, defined on the whole space as follows. Under α the point p
is fixed: a point q goes to that point q' on the line pq which is symmetrically
disposed about p. It is obvious that the straight lines through p, and only
these, are invariant under α.

Our metric is invariant under this transformation α. To see this, following
Kerékjártó, we need merely show that, for an arbitrary pair of points, α co-
incides with an appropriate element of G. To this end, let q and s be two
points, distinct from p, and let L' denote a line orthogonal to pq and ps.
Let g denote a half-turn about L'. It is clear that q and s have the same images
under this half-turn as under the antipodal transformation α. It now follows
immediately from the invariance of our metric that α carries straight lines of
space to straight lines, for these as we have seen are metrically characterized.

21.2. Consider now the line L and the plane π orthogonal to it at the point
p. Perform upon space the antipodal transformation α followed by a half-
turn about L. It is clear that the points of π are fixed points under this
product-transformation. But it is important to observe that they are the only
fixed points: this follows from the fact that all lines through p are invariant
under α and only those remain invariant under the half-turn which are or-
thogonal to L at p. Now take two points, q and s, in π. The line through these
must be invariant because our composite transformation carries it to a line
through q and s, since these are fixed points. This transformation, moreover,
preserves distances. Since it has a pair of fixed points it must leave all of the
line fixed. Therefore the line through q and s must lie in π, the locus of fixed
points. Then we have shown that with every pair of its points our plane con-
tains the line determined by these points.
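The product-transformation of 21.2 is easy to write down in a concrete model. The sketch below assumes euclidean coordinates with p at the origin and L along the third axis; this model and the function names are assumptions chosen for illustration only. In it the antipodal transformation followed by the half-turn about L is the reflection in the plane orthogonal to L at p, whose fixed points are exactly the points of that plane.

```python
def antipodal(v):
    """Antipodal transformation about p = (0, 0, 0): each point goes to its mirror image in p."""
    return (-v[0], -v[1], -v[2])

def half_turn_about_L(v):
    """Half-turn about the line L, taken here to be the third coordinate axis."""
    return (-v[0], -v[1], v[2])

def composite(v):
    """Antipodal transformation followed by the half-turn about L."""
    return half_turn_about_L(antipodal(v))

in_plane = (1, 2, 0)       # a point of the plane orthogonal to L at p
off_plane = (1, 2, 3)      # a point not on that plane

image_in_plane = composite(in_plane)
image_off_plane = composite(off_plane)
```

Integer coordinates are used so that the fixed-point comparison is exact.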

22. As we know, the plane π is invariant under the circle group K_L. We
want to show next that associated with every point q of π there is a similar
circle group leaving π invariant. Such a step will enable us to see that there is
nothing special about the point p and to conclude that π is either a euclidean
or hyperbolic plane under the subgroup G* of G which leaves π invariant.
This group is transitive on π.

22.1. Let us introduce the notion of the projection of a point x on the
plane π. This is defined as that point x' of π which is nearest to x. In order
to see that x' is determined, let z denote a point of π and let S denote a sphere
with center at x and radius xz. The solid sphere intersects π in a compact set.
For any point of π not in this intersection the distance to x must exceed xz.
On the compact set there certainly is one "nearest" point, but conceivably
more than one. Now there can be at most one such nearest point. For sup-
pose x' and x'' are two points of π at the same distance from x. Then the mid-
point of x'x'' is in π and is nearer to x than x' and x'' are. It is clear from the
continuity of distance and the uniqueness of projection that this projection
operation is continuous. We shall use this continuity in the following theorem.


THEOREM 19. Every point q of π is the projection of at least one point q'
not on π.



Consider a sphere about p large enough to have the point q inside of it.
This sphere meets π in a circle, call it C. Let S denote one of the hemispheres
associated with C. Let S* denote the set of projections of S. Since C can be
deformed on S to a point of S, it can be deformed on S* to a point of S*.
During this deformation it must meet q since q is in the domain bounded by C.
This means that q is a point of S* as was to be shown.

22.2. Let q' be a point not on π and let q, distinct from p, be its projection
on π. Let π' denote the plane of lines orthogonal to qq' at q. This plane con-
tains all straight lines of which it contains a pair of points, by 21.2.


THEOREM 20. The planes π and π' are identical.


It will first be shown that π' includes π. Consider the line L_{pq}. This line
is in both planes. The coordinate system around p shows us that there is one
and only one line in π which goes through p and is orthogonal to L_{pq}. Let L'
denote any line, distinct from this one, through p and in π. Let s denote the
projection of q on the line L'. The point s is distinct from p by our choice of L'.
Since s and q are points of π, L_{sq} belongs to π. Since it is a line of π which
goes through q it is also in π'. Therefore s and p are both points of π' and L_{ps}
belongs to π'. The set of points on such lines is dense in π, and the closure
of this set, which is π, must also belong to π'.

We will now show that π contains π'. The plane π has the property that
each of its points is interior to a two-cell, and it must therefore be an open set
in π'. On the other hand it is a closed subset of space and is therefore closed
in π'. Then it must coincide with π'.

23. Let (π, p) denote a marked plane, that is to say a plane π with some
one of its points p particularly specified.


THEOREM 21. Any two marked planes are congruent. 


Let (π, p) and (σ, s) be the two marked planes. Let L denote a line or-
thogonal to π at p. Such a line shall be by definition a line through p orthogo-
nal to every line of π through p. This orthogonal line always exists and is
unique, for it is the locus of points p' which project on p when they are pro-
jected on π. This we will see as follows. Let L* be the totality of points
projecting on p. As we know L* must contain at least one point p' distinct
from p. Hence from previous considerations L* must be the line L_{pp'}.

Now let L' denote a line orthogonal to σ at s. The marked line (L, p)
may be carried to the marked line (L', s). Since these lines completely de-
termine π and σ, (π, p) must go into (σ, s) by the element which takes (L, p)
to (L', s). If we carry the line L' into itself by any half turn about the point s,
the plane σ must go into itself with orientation reversed. The marked planes
are therefore congruent with a matching of any orientations that we choose
on them.

24. The geometric concepts have now been analyzed sufficiently for us to
be able to see that they satisfy the axioms usually given for a geometry, with
the exception of the parallel axiom. Axioms I, II, III, and V as given by
Hilbert [4] for example are all satisfied.

It is clear that there are two possible geometries satisfying our axioms for 
the space case, the euclidean and the hyperbolic. We now sketch rapidly one 
method for seeing that there are not more than two. 

Let π be any plane in E. The subgroup of G which takes π into itself and
preserves orientation on π can be seen to satisfy the axioms of Hilbert's paper
[5]. The plane is therefore either euclidean or hyperbolic.

Since all planes are congruent to any given plane, we see that either every 
plane is hyperbolic or every plane is euclidean. We wish to show that the 
geometry induced by G is either euclidean or hyperbolic according to the char- 
acter of the planes. 

Let (E, G) and (E’, G’) be two systems satisfying all the axioms for the 
space case and assume that in the two systems planes are of the same char- 
acter. 

Let (π, p) be a marked plane in E and let (π', p') be a marked plane in E'.
Let H denote a congruence correspondence between these two planes. Let L
and L' be the unique lines of E and E' which are orthogonal to π and π'
at p and p'. The unit of length gives us a unique correspondence between L
and L', the only choice, and it is an arbitrary one, being which half of L is
mapped on which of L'. Assume that this choice has been made so that we
have really chosen an upper and a lower half for L and also for L'.

We can now choose coordinates in E and E’ and extend H by letting points 
with the same coordinates correspond. The correspondence H as thus ex- 
tended is isometric. It preserves segments, orthogonality, lines, planes, and 
in fact all geometric concepts. The function H also associates with every g 
in G an element g’ in G’ and we are therefore led to the conclusion that (E, G) 
and (E', G') are equivalent provided the planar character of the two systems
is the same. 


APPENDIX 


For the space case an alternative set of axioms might be chosen as follows. 
Let (E_3, G) be a system consisting of a set G of sense preserving homeo-
morphisms of E_3. Let the following axioms be satisfied.


2.1’’. The same as 2.1’. 


2.2''. There is a point p such that G_p satisfies the conditions (a), (b), and (c):

(a) G_p is a proper subgroup of G.

(b) For each x distinct from p, G_p(x) contains at least three points.

(c) For a sequence of points p_n approaching p, G_p(p_n) is at least one dimen-
sional.


2.3’’. The same as 2.3’. 




We will show that these axioms imply 2.2’ from which it follows that they 
suffice for the foundation of space geometry. 

Quite as in the paper we can arrive at the situation of §5. We have then
an open set R_1 with a closure R̄_1, and H_1 is a compact effective transformation
group of R_1. The orbits of H_1 are the same as those of G_p.

Let H_1* be the component of the identity of H_1. The set H_1*(x) has the
same dimension as H_1(x) and H_1* must have orbits of dimension at least one
in R̄_1.

Assume now that H_1* is one dimensional. Then H_1* is the circle group. As
H_1* leaves fixed an "axis" of points of R̄_1, not every x distinct from p has 3
points in its orbit under H_1*. Therefore H_1* does not exhaust H_1. But H_1* di-
vides R_1 locally (near p) into a decomposition space which is essentially a
half plane. H_1 - H_1* must act on this half plane and the only compact group
which can act on a half plane is a group of order two which reflects its edge.
Hence in any case points on the "axis" of H_1* will have orbits of at most two
points under H_1. This shows that H_1 cannot be one dimensional. But since
it is now seen to be of dimension greater than one it must also have orbits of
dimension greater than one by arguments in our earlier papers. This con-
cludes the reduction of the present system of axioms to those of this paper.

It should be remarked here that in the presence of condition (b) above
it is altogether likely that condition (c) can be relaxed, perhaps merely to
assert, with Hilbert, that G_p(p_n) is infinite. This appears to have all of the
difficulty which attends the problem of showing, if it is true, that a zero
dimensional topological transformation group of three-space is necessarily
finite.

We might mention in conclusion that it seems to us of some interest to 
determine the three-space geometries through appropriate reflection groups, 
along the lines on which this was done for the plane by Cairns. While it is 
clear that suitable conditions on “reflections” of three-space could be made 
to yield the axioms of this paper, the characterization of the fixed points of 
reflection of three-space by P. Smith might lead to an interesting approach. 

Added in proof: Kerékjártó has informed us that he has published a
further paper on this same subject in the Proceedings of the Hungarian
Academy of Sciences (1928) (in Hungarian). He is about to publish another
paper on this subject in the Acta Mathematica.


BIBLIOGRAPHY 


1. Alexandroff and Hopf, Topologie I, Berlin, 1935. 

2. Cartan, La Théorie des Groupes Finis et Continus et l’Analysis Situs, Mémorial des Sci- 
ences Mathématiques, vol. 42. 

3. Cairns, An axiomatic basis for plane geometry, these Transactions, vol. 35 (1933), pp. 
234-244. 

4. Hilbert, Grundlagen der Geometrie, 7th edition, 1930. 

5. Hilbert, Über die Grundlagen der Geometrie, Mathematische Annalen, vol. 56, pp. 381-422.
This article is reprinted as appendix IV, pp. 178-230, in the edition of Hilbert's book
referred to above.

6. Kerékjártó, On a geometrical theory of continuous groups, II. Euclidean and hyperbolic
groups of three dimensional space, Annals of Mathematics, (2), vol. 29, pp. 169-179.

7. Montgomery and Zippin, Periodic one-parameter groups in three-space, these Transac-
tions, vol. 40 (1936), pp. 24-36.

8. Montgomery and Zippin, Compact abelian transformation groups, Duke Mathematical Journal, vol. 4
(1938), pp. 363-373.

9. Montgomery and Zippin, Non-abelian compact connected groups of three-space, American Journal of
Mathematics, vol. 61 (1939), pp. 375-387.

10. Montgomery and Zippin, Topological transformation groups I, Annals of Mathematics, (2), vol. 41 (1940).

11. Montgomery and Zippin, A theorem on the rotation group of the two-sphere, Bulletin of the American
Mathematical Society, vol. 46 (1940), pp. 520-521.

12. P. A. Smith, The topology of transformation groups, Bulletin of the American Mathe- 
matical Society, vol. 44 (1938), pp. 497-514. 

13. Veblen and Young, Projective Geometry, vol. 2. 


SMITH COLLEGE,
NORTHAMPTON, MASS.,

QUEENS COLLEGE, 
N. Y. 


CONFORMALITY IN CONNECTION WITH FUNCTIONS 
OF TWO COMPLEX VARIABLES 


BY 
EDWARD KASNER 


1. Introduction. In the theory of functions of two complex variables
z = x+iy, w = u+iv, the transformations of importance are Z = Z(z, w),
W = W(z, w) where Z and W are general analytic functions (power series)
such that the jacobian Z_z W_w - Z_w W_z is not identically zero. Any pair of such
functions may be regarded as a transformation from the points (x, y, u, v)
to the points (X, Y, U, V) of a given real cartesian four-space S_4. Poincaré
in his fundamental paper in the Palermo Rendiconti (1907) called any such
correspondence a regular transformation. We employ also the term pseudo-
conformal transformation. The totality of these transformations forms an
infinite group G. This is not the conformal group of the four-space S_4 as is the
case for the infinite group of analytic functions Z = Z(z) of a single complex
variable z = x+iy. As a matter of fact, the theorem of Liouville states that
the conformal group of the four-space S_4 is merely the fifteen-parameter
group of inversions.

In this paper, we shall obtain several geometric characterizations of this 
group G of regular transformations. Our main theorem is that the group G 
of regular (or pseudo-conformal) transformations is characterized by the fact that 
it leaves invariant the pseudo-angle between any curve C and any hypersurface H 
at their common point of intersection. 

The pseudo-angle may be visualized geometrically as follows. Let a lineal 
element C and a hypersurface element H intersect in a common point p. 
Rotate the lineal element C about the point p into the hypersurface element 
H in the unique planar direction (the isoclinal planar direction), which has 
the property that the angle between any two lineal elements of the rotation 
is equal to the angle between their orthogonal projections onto the z- (and w-) 
plane. There is a unique lineal element C_1 in the hypersurface element H,
which is the end result of this rotation. Our pseudo-angle is then the actual
angle between the initial lineal element C and the terminal lineal element C_1
of this rotation.

In conclusion we study Picard's sixteen-parameter group, used in the
theory of hyperfuchsian functions. The only pseudo-conformal transforma-
tions actually conformal in S_4 constitute a nine-parameter subgroup.

Another geometric interpretation of functions of two complex variables
is obtained by using point-pairs (bipoints) in the plane; and this is easily
extended to n variables by using n-points or polygons. See the Bulletin of the
American Mathematical Society, vol. 15 (1909), p. 159.

Presented to the Society, September 11, 1908; also at the Zurich International Congress,
1932; received by the editors May 26, 1939.

2. Isoclinal and reverse isoclinal planes. Before proving these geometric
characterizations of the infinite group G, we shall have to consider some pre-
liminary definitions and theorems. A surface s of the four-space S_4 is given by
the two equations F_1(x, y, u, v) = 0, F_2(x, y, u, v) = 0, where F_1 and F_2 are two
independent functions of (x, y, u, v). Let P_s(x, y, u, v) be any point of the
surface s. Construct the orthogonal projections P_z(x, y, 0, 0) and P_w(0, 0, u, v)
(by means of absolutely perpendicular planes) of the point P_s on the z- and
w-planes respectively. Thus the surface s induces (1) the correspondence R_zw
between the points P_z and P_w of the z- and w-planes, (2) the correspondence
R_zs between the points P_z and P_s of the z-plane and the surface s, and (3)
the correspondence R_ws between the points P_w and P_s of the w-plane and the
surface s. We call R_zw, R_zs, R_ws the three correspondences associated with the sur-
face s. The two correspondences R_zs and R_ws are the result of orthogonal
projections of the points of the surface s onto the z- and w-planes. The corre-
spondence R_zw is given by the equations F_1(x, y, u, v) = 0, F_2(x, y, u, v) = 0
of the surface s. It is noted that any one of these three correspondences may
be degenerate.

Since any orthogonal projection of a plane upon a plane in the four-space
S_4 preserves parallel lines, we find that for a plane π, each of the three associ-
ated correspondences R_zw, R_zπ, R_wπ is an affine transformation. Conversely
if any one of the three correspondences R_zw, R_zs, R_ws associated with a surface
s is an affine transformation, then all three are affine transformations and
the surface s is a plane. Of course, all of these statements are equivalent to the
fact that a plane of the four-space S_4 is given by two independent linear equa-
tions in the unknowns (x, y, u, v).

For a general plane π, each of the associated correspondences R_zw, R_zπ, R_wπ
is an affine transformation. If the associated correspondence R_zw is a direct
(or reverse) similitude, then π is termed an isoclinal plane (or a reverse iso-
clinal plane). For an isoclinal plane, the correspondences R_zπ and R_wπ are
both direct or reverse similitudes according to the choice of the positive sense
of rotation of the angle in π. Similarly for a reverse isoclinal plane π, the
correspondences R_zπ and R_wπ are respectively direct and reverse or reverse
and direct similitudes according to the choice of the positive sense of rotation
of the angle in π. Thus for an isoclinal or a reverse isoclinal plane, it is found
that under each of the three associated correspondences R_zw, R_zπ, R_wπ the
angle between any two lines is preserved.

An isoclinal plane may be given by the single complex equation w = lz + m,
where l and m are arbitrary complex constants; whereas a reverse isoclinal
plane may be given by the single complex equation w = l z̄ + m, where z̄ = x - iy
is the conjugate of z = x + iy. Thus in the totality of ∞^6 planes of the four-
space S_4, there are ∞^4 isoclinal (or reverse isoclinal) planes. These ∞^4
isoclinal (or reverse isoclinal) planes form a linear system of planes. Through
any given point (or in any hyperplane) of the four-space S_4, there are ∞^2
isoclinal (or reverse isoclinal) planes. There is one and only one isoclinal (or
reverse isoclinal) plane which passes through a given line of the four-space S_4.

We obtain the following three characterizations of the set of 2∞^4 isoclinal
and reverse isoclinal planes among the totality of ∞^6 planes of the four-space
S_4. (1) A plane π is an isoclinal or a reverse isoclinal plane if and only if at
least one of the associated affine transformations R_zw, R_zπ, R_wπ is a similitude.
(2) The necessary and sufficient condition that a plane π be an isoclinal or a
reverse isoclinal plane is that the angle between any line L of π and its or-
thogonal projection L_z (or L_w) onto the z- (or w-) plane is constant. This
result gives the reason for the term isocline. Let φ (or ψ) be the constant angle
between any line L of the isoclinal or reverse isoclinal plane π and its orthogo-
nal projection L_z (or L_w) onto the z- (or w-) plane. Then φ and ψ are comple-
mentary angles. (3) A plane π is an isoclinal or a reverse isoclinal plane if
and only if the maximum and minimum angles between the plane π and
the z- (or w-) plane are equal. The common value of the maximum and
minimum angles between the isoclinal or reverse isoclinal plane π and the
z- (or w-) plane is φ (or ψ). Thus an isoclinal or a reverse isoclinal plane
makes complementary angles with the z- and w-planes. Also any area in any
isoclinal or reverse isoclinal plane is equal to the sum or difference of its
orthogonal projections on the z- and w-planes. Finally we note that for the
isoclinal plane w = lz + m or the reverse isoclinal plane w = l z̄ + m, the angle φ
is arc tan |l|, where |l| denotes the absolute value of l.
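The constancy of the angle φ in characterization (2) can be checked numerically. The sketch below is an illustration only: it takes an isoclinal plane w = lz + m with an arbitrarily chosen l, forms the real direction (dz, l dz) of a line in the plane, and measures its angle with the projections (dz, 0) and (0, l dz). The function names and the sample value of l are assumptions, not part of the text.

```python
import cmath
import math

def angle_with_z_projection(l, dz):
    """Angle between the direction (dz, l*dz) of a line in w = l*z + m
    and its orthogonal projection (dz, 0) on the z-plane."""
    dw = l * dz
    length = math.hypot(abs(dz), abs(dw))   # euclidean length in four-space
    return math.acos(abs(dz) / length)

def angle_with_w_projection(l, dz):
    """Angle between the same direction and its projection (0, l*dz) on the w-plane."""
    dw = l * dz
    length = math.hypot(abs(dz), abs(dw))
    return math.acos(abs(dw) / length)

l = 2 - 1j                                   # hypothetical sample coefficient
directions = [cmath.rect(1.0, t) for t in (0.0, 0.7, 1.9, 3.0)]
phis = [angle_with_z_projection(l, dz) for dz in directions]
psis = [angle_with_w_projection(l, dz) for dz in directions]
expected_phi = math.atan(abs(l))
```

Every direction in the plane gives the same φ, equal to arc tan |l|, and φ + ψ comes out as a right angle, in agreement with the statement that φ and ψ are complementary.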

3. Conformal and reverse conformal surfaces. The envelope of ∞^2 iso-
clinal (or reverse isoclinal) planes is called a conformal surface (or a reverse
conformal surface). Upon finding the envelope of the ∞^2 isoclinal planes
w = l(r, t)z + m(r, t) (or of the reverse isoclinal planes w = l(r, t)z̄ + m(r, t)) where
l and m are complex functions of the real variables r and t, we find that any
conformal (or reverse conformal) surface may be given by the single complex
equation w = f(z) (or w = f(z̄)), where f is an analytic function of z (or z̄).
From this, it follows that a conformal (or reverse conformal) surface may be
given by the two real equations u = u(x, y), v = v(x, y), where u and v are arbi-
trary real functions of (x, y) which satisfy the Cauchy-Riemann equations
u_x = v_y, u_y = -v_x (or the reverse Cauchy-Riemann equations u_x = -v_y, u_y = v_x).

From the above facts, it easily follows that the correspondence R_zw for a
conformal (or reverse conformal) surface s is direct conformal (or reverse con-
formal). For a conformal surface s, the correspondences R_zs and R_ws are both
direct or reverse conformal transformations according to the choice of the
positive sense of rotation of the angle in s. Similarly for a reverse conformal
surface s, the correspondences R_zs and R_ws are respectively direct and reverse
or reverse and direct conformal according to the choice of the positive sense
of rotation of the angle in s. Thus for a conformal (or reverse conformal)
surface s, each of the associated correspondences R_zw, R_zs, R_ws preserves the angle
between two intersecting curves. Conversely if at least one of the associated
correspondences R_zw, R_zs, R_ws of a surface s is conformal (direct or reverse),
then all three are conformal (direct or reverse), and s is either a conformal
or a reverse conformal surface.

4. Statements of our results. Under the group G of regular transforma-
tions, every conformal surface is carried into a conformal surface. On the
other hand, a reverse conformal surface is not, in general, carried into a reverse
conformal surface. A transformation T of the four-space S_4 is regular if and only
if it converts every conformal surface into a conformal surface. The group G of
regular transformations preserves the angle and also the sense of rotation be-
tween any two intersecting curves contained in a conformal surface. Thus
this group G induces the group of direct conformal transformations between
the conformal surfaces of the four-space S_4.

If two intersecting curves C_1 and C_2 are tangent to a conformal surface
at their common point (or two hypersurfaces H_1 and H_2 intersect in a con-
formal surface), then under the group G of regular transformations, the two
curves C_1 and C_2 (or the two hypersurfaces H_1 and H_2) possess the angle
between them as the fundamental differential invariant of the first order.
On the other hand, two intersecting curves C_1 and C_2 not both tangent to a
conformal surface (or two hypersurfaces H_1 and H_2 not intersecting in a con-
formal surface) at their common point do not possess any differential invari-
ants of the first order under the infinite group G of regular transformations.
This means that under the group G of regular transformations, any two con-
current lineal elements not both contained in an isoclinal surface element (or
two concurrent hypersurface elements not intersecting in an isoclinal surface
element) can be converted into any other two concurrent lineal elements not
both contained in an isoclinal surface element (or any other two concurrent
hypersurface elements not intersecting in an isoclinal surface element).

The simplest characterization of the group G of regular transformations is
connected with the intersection of a curve and a three-dimensional variety.
Let a curve C and a hypersurface H intersect in a point p. There is a unique
isoclinal plane which passes through the point p and is tangent to the curve C.
Let C_1 be any curve through the point p which is tangent to this isoclinal
plane and to the hypersurface H. All such curves C_1 are tangent to each other
at the point p. The angle between the curve C and the curve C_1 is the funda-
mental differential invariant of the first order between the curve C and the
hypersurface H. This angle is called the pseudo-angle between the curve C
and the hypersurface H. A transformation T of the four-space S_4 is regular if
and only if it preserves the pseudo-angle between any curve C and any hypersur-
face H. Thus the infinite group G of regular transformations is characterized
by the fact that it leaves invariant the pseudo-angle between every curve C
and every hypersurface H.




In the final part of our paper, we shall give a brief discussion of the Picard
sixteen-parameter group G_16 of linear fractional transformations in w and z.
If a regular transformation T converts 4∞^2 isoclinal planes into isoclinal planes,
then T carries every isoclinal plane into an isoclinal plane, and therefore T is a
linear fractional transformation of the group G_16. For any other regular trans-
formation T, at most 3∞^2 isoclinal planes become isoclinal planes.

To prove our theorems, we shall have to consider the lineal elements of
the four-space S_4 which pass through a given point. Any lineal element
through a fixed point may be defined by (ρx', ρy', ρu', ρv'), where x', y', u', v'
denote the differentials dx, dy, du, dv respectively, and ρ is any real nonzero
factor of proportionality. However, to prove our results we shall find it more
convenient to define any real lineal element through a given point by the com-
plex coordinates (ρz', ρw'), where z' = x' + iy', w' = u' + iv', and ρ is any
real nonzero factor of proportionality.

5. The necessity of our results. Let T be the regular transformation
Z = Z(z, w), W = W(z, w). Let p(x, y, u, v) be a fixed point of the four-space S_4
and let P(X, Y, U, V) be the transformed point under the regular transforma-
tion T. Then the special projective transformation between the two bundles
of lineal elements through the points p and P, which is induced by the regular
transformation T, is given by the equations


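\[ \rho Z' = \alpha z' + \beta w', \qquad \rho W' = \gamma z' + \delta w', \tag{1} \]

where α, β, γ, δ are

\[
\begin{aligned}
\alpha &= \frac{\partial Z}{\partial z} = \frac{1}{2}\left(\frac{\partial}{\partial x} - i\frac{\partial}{\partial y}\right)(X + iY) = X_x - iX_y, \\
\beta &= \frac{\partial Z}{\partial w} = \frac{1}{2}\left(\frac{\partial}{\partial u} - i\frac{\partial}{\partial v}\right)(X + iY) = X_u - iX_v, \\
\gamma &= \frac{\partial W}{\partial z} = \frac{1}{2}\left(\frac{\partial}{\partial x} - i\frac{\partial}{\partial y}\right)(U + iV) = U_x - iU_y, \\
\delta &= \frac{\partial W}{\partial w} = \frac{1}{2}\left(\frac{\partial}{\partial u} - i\frac{\partial}{\partial v}\right)(U + iV) = U_u - iU_v.
\end{aligned}
\tag{2}
\]

Any hypersurface is defined by the equation H(x, y, u, v) = 0, where H is
any arbitrary real function of (x, y, u, v). Thus any hypersurface element
through the fixed point p(x, y, u, v) is given by

\[ a z' + b w' + \bar{a}\,\bar{z}' + \bar{b}\,\bar{w}' = 0, \tag{3} \]

where a and b are

\[ a = \frac{\partial H}{\partial z} = \frac{1}{2}\left(\frac{\partial H}{\partial x} - i\frac{\partial H}{\partial y}\right), \qquad b = \frac{\partial H}{\partial w} = \frac{1}{2}\left(\frac{\partial H}{\partial u} - i\frac{\partial H}{\partial v}\right). \tag{4} \]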


From (3) and (4), we see that any real hypersurface element through the fixed
point p is defined by the complex coordinates (σa, σb) where σ is a real non-
zero factor of proportionality.

From (1) and (3), we find that the special projective transformation be-
tween the two bundles of hypersurface elements through the fixed points p
and P, which is induced by the regular transformation T, is given by

\[ \sigma a = \alpha A + \gamma B, \qquad \sigma b = \beta A + \delta B. \tag{5} \]
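The pseudo-angle of §4 can be computed directly in these complex coordinates, and its invariance under (1) and (5) checked numerically. The sketch below rests on the following reading, which is our own working assumption rather than a formula from the text: the isoclinal surface element through the lineal element (z', w') consists of its complex multiples λ(z', w'), and such a multiple lies in the hypersurface element (a, b) of (3) when the real part of λ(az' + bw') vanishes. The sample coefficients and all function names are hypothetical.

```python
import cmath
import math

def pseudo_angle(v, h):
    """Pseudo-angle between the lineal element v = (z', w') and the
    hypersurface element h = (a, b), both in complex coordinates."""
    c = h[0] * v[0] + h[1] * v[1]
    # C_1 = lam * v lies in the hypersurface element when Re(lam * c) = 0,
    # so lam = i * conj(c) up to a real factor.
    lam = 1j * c.conjugate()
    ang = abs(cmath.phase(lam))        # angle between v and lam * v, in [0, pi]
    return min(ang, math.pi - ang)     # lineal elements are undirected

def transform_lineal(M, v):
    """Equations (1): (z', w') -> (Z', W'), with M = ((alpha, beta), (gamma, delta))."""
    (al, be), (ga, de) = M
    return (al * v[0] + be * v[1], ga * v[0] + de * v[1])

def pull_back_hypersurface(M, H):
    """Equations (5): the element (a, b) at p corresponding to (A, B) at P."""
    (al, be), (ga, de) = M
    A, B = H
    return (al * A + ga * B, be * A + de * B)

M = ((1 + 1j, 2 - 0.5j), (0.5j, 3 + 0j))   # sample alpha, beta, gamma, delta
v = (1 + 2j, -0.5 + 1j)                     # sample lineal element at p
H = (0.3 - 1j, 2 + 0.4j)                    # sample hypersurface element at P

before = pseudo_angle(v, pull_back_hypersurface(M, H))
after = pseudo_angle(transform_lineal(M, v), H)
```

The two values agree because az' + bw' and AZ' + BW' coincide under (1) and (5), up to the real factors ρ and σ, which do not affect the amplitude modulo π.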


Since the equation of any conformal surface is of the form w=f(z) where f
is an analytic function of z, there follows from the equations of any regular
transformation T


THEOREM 1. Under the group G of regular transformations, every conformal
surface is converted into a conformal surface.


Since every conformal surface becomes a conformal surface, it follows that 
under the group G of regular transformations, every isoclinal surface element 
is carried into an isoclinal surface element. This is also a consequence of equa- 
tions (1) upon observing that the equation of any isoclinal surface element 
through the fixed point p is w' = lz', where l is an arbitrary complex constant.

Two lineal elements are said to be an isoclinal pair if they are contained in 
an isoclinal surface element. The condition for an isoclinal pair of lineal elements 
is 

(6)    \frac{z_2'}{z_1'} = \frac{w_2'}{w_1'} = \text{complex constant (not real)}.


Two hypersurface elements are said to form an isoclinal pair if they intersect 
in an isoclinal surface element. The condition for an isoclinal pair of hyper- 
surface elements is 


(7)    \frac{a_2}{a_1} = \frac{b_2}{b_1} = \text{complex constant (not real)}.


From equations (1) and (6), we obtain 




THEOREM 2. Two intersecting curves C1 and C2 which are tangent to a
conformal surface at their common point possess the fundamental differential
invariant of first order

(8)    \operatorname{amp} \frac{z_2'}{z_1'} = \operatorname{amp} \frac{w_2'}{w_1'}.

This is the angle between the two curves C1 and C2. It can be written in the
real form

(9)    \arctan \frac{dx_1\,dy_2 - dy_1\,dx_2}{dx_1\,dx_2 + dy_1\,dy_2} = \arctan \frac{du_1\,dv_2 - dv_1\,du_2}{du_1\,du_2 + dv_1\,dv_2}.
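
As a numerical illustration of Theorem 2, the following Python sketch applies a map of form (1), taking ρ = 1, to an isoclinal pair of lineal elements and compares the angle (8) before and after. The coefficient values, the slope of the isoclinal surface element, and all function names are arbitrary choices made for this illustration; none of them comes from the paper.

```python
import cmath
import math

def apply_map(coeffs, element):
    """Apply (1) with rho = 1: (z', w') -> (alpha z' + beta w', gamma z' + delta w')."""
    alpha, beta, gamma, delta = coeffs
    z, w = element
    return alpha * z + beta * w, gamma * z + delta * w

def angle_complex(c1, c2):
    """Complex form (8): amp(c2 / c1)."""
    return cmath.phase(c2 / c1)

def angle_real(c1, c2):
    """Real form (9), using atan2 so that the quadrant is retained."""
    x1, y1, x2, y2 = c1.real, c1.imag, c2.real, c2.imag
    return math.atan2(x1 * y2 - y1 * x2, x1 * x2 + y1 * y2)

# Assumed sample data (hypothetical values, chosen only for the check).
coeffs = (1 + 2j, 0.5 - 1j, -0.3 + 0.7j, 2 - 0.4j)
slope = 0.8 + 1.5j                    # isoclinal surface element w' = l z'
z1, z2 = 1 + 1j, -2 + 0.5j
e1, e2 = (z1, slope * z1), (z2, slope * z2)

E1, E2 = apply_map(coeffs, e1), apply_map(coeffs, e2)

angle_before = angle_complex(e1[0], e2[0])
angle_after_Z = angle_complex(E1[0], E2[0])   # amp(Z2'/Z1')
angle_after_W = angle_complex(E1[1], E2[1])   # amp(W2'/W1')
angle_before_real = angle_real(z1, z2)
```

Because both elements lie in w' = lz', the map multiplies z1' and z2' by the same complex factor α + βl, so the ratio z2'/z1' is unchanged; the sketch only confirms this in floating point.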


By equations (5) and (7), we obtain the following dual result: 


THEOREM 3. Two hypersurfaces H1 and H2 which intersect in a conformal
surface possess the fundamental differential invariant of first order

(10)    \operatorname{amp} \frac{a_2}{a_1} = \operatorname{amp} \frac{b_2}{b_1}.

This is the angle between the two hypersurfaces H1 and H2. It can be written in
the real form

(11)    \arctan \frac{H_{1y}H_{2x} - H_{1x}H_{2y}}{H_{1x}H_{2x} + H_{1y}H_{2y}} = \arctan \frac{H_{1v}H_{2u} - H_{1u}H_{2v}}{H_{1u}H_{2u} + H_{1v}H_{2v}}.

Let us now consider the case where two intersecting curves C1 and C2 are
not both tangent to a conformal surface at their common point. In that case,
we can convert any non-isoclinal pair of lineal elements (ρ1Z1', ρ1W1') and
(ρ2Z2', ρ2W2') into the lineal elements (1, 0) and (0, 1), which of course are a
non-isoclinal pair of lineal elements. The most general transformation of form
(1) that will do this is

(12)    \rho Z' = \rho_1 Z_1' z' + \rho_2 Z_2' w', \qquad \rho W' = \rho_1 W_1' z' + \rho_2 W_2' w'.

This is an admissible transformation since the jacobian
J = ρ1ρ2(Z1'W2' − Z2'W1') is not zero. Hence we have proved that two
intersecting curves C1 and C2 not tangent to a conformal surface at their
common point have no differential invariants of the first order. The dual
results for hypersurfaces are also valid. Thus we have

THEOREM 4. Two intersecting curves C1 and C2 not both tangent to a
conformal surface at their common point (or two hypersurfaces H1 and H2 which
do not intersect in a conformal surface) possess no differential invariants of
the first order.

Let C(z', w') be a given lineal element and H(a, b) a given hypersurface
element. There is a unique isoclinal surface element which contains the curve
C(z', w'). It is given by


(13)    \frac{Z'}{z'} = \frac{W'}{w'} = \lambda,

where λ is a complex constant (not real). Upon substituting this into the
equation $aZ' + bW' + \bar{a}\bar{Z}' + \bar{b}\bar{W}' = 0$ of the hypersurface
element H(a, b), we find that the lineal element C1 of intersection between the
isoclinal surface element (13) and the hypersurface element H(a, b) is given by
the equation

(14)    \frac{Z'}{z'} = \frac{W'}{w'} = i(\bar{a}\,\bar{z}' + \bar{b}\,\bar{w}').


Since, according to Theorem 2, the angle between the curves C and C, is in- 
variant, we obtain 


THEOREM 5. A curve C and a hypersurface H which intersect in a common
point possess the fundamental differential invariant of first order

(15)    \tfrac{1}{2}\pi - \operatorname{amp}(a z' + b w'),

evaluated at the common point. This is called the pseudo-angle between the curve C
and the hypersurface H. The pseudo-angle represents the angle between the curve
C and any curve C1 through the point p such that C and C1 are tangent to a
conformal surface at the point p, and C1 is tangent to the hypersurface H at the
point p. Dually, we find that the pseudo-angle represents the angle between the
hypersurface H and any hypersurface H1 through the point p such that H and H1
intersect in a conformal surface and H1 is tangent to the curve C at the point p.
This pseudo-angle can be written in the real form

(16)    \arctan \frac{H_x\,dx + H_y\,dy + H_u\,du + H_v\,dv}{-H_y\,dx + H_x\,dy - H_v\,du + H_u\,dv}.


The fact that this is the only differential invariant of the first order be- 
tween a curve C and a hypersurface H which pass through a given point p 
is an immediate consequence of equations (1) and (5). 
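
The invariance asserted in Theorem 5 can be checked numerically from equations (1) and (5). The Python sketch below takes ρ = σ = 1 and uses sample values that are assumptions made purely for illustration: it starts from a transformed hypersurface element (A, B), recovers (a, b) from (5), and compares the quantity az' + bw' with AZ' + BW'.

```python
import cmath
import math

def pseudo_angle(a, b, z, w):
    """Expression (15): pi/2 - amp(a z' + b w')."""
    return math.pi / 2 - cmath.phase(a * z + b * w)

# Assumed sample data (hypothetical values).
alpha, beta, gamma, delta = 1 + 2j, 0.5 - 1j, -0.3 + 0.7j, 2 - 0.4j
z, w = 1.2 - 0.7j, 0.3 + 2j          # lineal element c(z', w')
A, B = 0.6 - 1.1j, -0.9 + 0.2j       # transformed hypersurface element H(A, B)

# Equations (1) with rho = 1: the transformed lineal element C(Z', W').
Z = alpha * z + beta * w
W = gamma * z + delta * w

# Equations (5) with sigma = 1: the original hypersurface element h(a, b).
a = alpha * A + gamma * B
b = beta * A + delta * B

value_before = a * z + b * w
value_after = A * Z + B * W
angle_before = pseudo_angle(a, b, z, w)
angle_after = pseudo_angle(A, B, Z, W)
```

With general real factors ρ and σ the two complex values differ by a real multiple, so the amplitude can change only by a multiple of π.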

6. The sufficiency of our results. Let a general transformation T,

(17)    X = X(x, y, u, v), \quad Y = Y(x, y, u, v), \quad U = U(x, y, u, v), \quad V = V(x, y, u, v),

be given. T is not necessarily a regular transformation. Let p(x, y, u, v) be
a fixed point of the four-space S4 and let P(X, Y, U, V) be the transformed
point under the transformation T. Then T induces the following general
projective transformation between the two bundles of lineal elements through
the points p and P:


(18)    \rho X' = X_x x' + X_y y' + X_u u' + X_v v',
        \rho Y' = Y_x x' + Y_y y' + Y_u u' + Y_v v',
        \rho U' = U_x x' + U_y y' + U_u u' + U_v v',
        \rho V' = V_x x' + V_y y' + V_u u' + V_v v'.


Changing (18) from the real notation (x’, y’, u’, v’) to the complex nota- 
tion by means of the equations 


(19)    Z' = X' + iY', \qquad x' = \tfrac{1}{2}(z' + \bar{z}'),
        W' = U' + iV', \qquad u' = \tfrac{1}{2}(w' + \bar{w}'),


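we find that equations (18) may be written in the compact complex form

(20)    \rho Z' = \alpha z' + \beta w' + \varphi \bar{z}' + \psi \bar{w}', \qquad \rho W' = \gamma z' + \delta w' + \chi \bar{z}' + \omega \bar{w}',

where α, β, γ, δ, φ, ψ, χ, ω are given by

(21)    \alpha  = \frac{1}{2}\left(\frac{\partial}{\partial x} - i\frac{\partial}{\partial y}\right)(X + iY) = \frac{1}{2}(X_x + Y_y) + \frac{i}{2}(-X_y + Y_x),
        \beta   = \frac{1}{2}\left(\frac{\partial}{\partial u} - i\frac{\partial}{\partial v}\right)(X + iY) = \frac{1}{2}(X_u + Y_v) + \frac{i}{2}(-X_v + Y_u),
        \gamma  = \frac{1}{2}\left(\frac{\partial}{\partial x} - i\frac{\partial}{\partial y}\right)(U + iV) = \frac{1}{2}(U_x + V_y) + \frac{i}{2}(-U_y + V_x),
        \delta  = \frac{1}{2}\left(\frac{\partial}{\partial u} - i\frac{\partial}{\partial v}\right)(U + iV) = \frac{1}{2}(U_u + V_v) + \frac{i}{2}(-U_v + V_u),
        \varphi = \frac{1}{2}\left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right)(X + iY) = \frac{1}{2}(X_x - Y_y) + \frac{i}{2}(X_y + Y_x),
        \psi    = \frac{1}{2}\left(\frac{\partial}{\partial u} + i\frac{\partial}{\partial v}\right)(X + iY) = \frac{1}{2}(X_u - Y_v) + \frac{i}{2}(X_v + Y_u),
        \chi    = \frac{1}{2}\left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right)(U + iV) = \frac{1}{2}(U_x - V_y) + \frac{i}{2}(U_y + V_x),
        \omega  = \frac{1}{2}\left(\frac{\partial}{\partial u} + i\frac{\partial}{\partial v}\right)(U + iV) = \frac{1}{2}(U_u - V_v) + \frac{i}{2}(U_v + V_u).

The transformation (20) is thus the general projective transformation (18)
between the two bundles of lineal elements through the two points p and P.
Let the transformation T carry every conformal surface into a conformal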


surface. Then T must convert every isoclinal surface element into an iso- 
clinal surface element. Hence (20) must carry every equation of the form 
w’ =lz’ into an equation of the same form. For this to be so, we must have 


(22)    \varphi = \psi = \chi = \omega = 0.


These are the double Cauchy-Riemann equations for the two complex functions
X+iY and U+iV. Hence these functions must be analytic functions
of z and w. Thus


THEOREM 6. Any transformation T of the four-space S4 which converts every
conformal surface into a conformal surface is a regular transformation. Thus the
infinite group G of regular transformations is characterized by the fact that it
preserves conformal surfaces.
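
The vanishing conditions (22) can be illustrated with a small finite-difference experiment. The Python sketch below estimates the coefficients φ and ψ of (21) for two sample functions Z(z, w); the sample functions, the evaluation point, and the helper name `d_bar` are assumptions introduced here for illustration only. The analytic sample should give values near zero, while the sample containing a conjugate should not.

```python
def d_bar(F, point, h=1e-6):
    """Central-difference estimate of (1/2)(d/dx + i d/dy) F at point = x + iy."""
    Fx = (F(point + h) - F(point - h)) / (2 * h)
    Fy = (F(point + 1j * h) - F(point - 1j * h)) / (2 * h)
    return 0.5 * (Fx + 1j * Fy)

# Assumed evaluation point (hypothetical values).
z0 = 0.7 + 0.2j
w0 = 0.4 - 0.9j

def Z_regular(z, w):
    """An analytic function of z and w."""
    return z * z + 3 * w

def Z_irregular(z, w):
    """Not analytic in z, because of the conjugate."""
    return z.conjugate() + w * w

# phi: derivative with respect to the conjugate of z, holding w fixed.
phi_regular = d_bar(lambda z: Z_regular(z, w0), z0)
phi_irregular = d_bar(lambda z: Z_irregular(z, w0), z0)

# psi: derivative with respect to the conjugate of w, holding z fixed.
psi_regular = d_bar(lambda w: Z_regular(z0, w), w0)
psi_irregular = d_bar(lambda w: Z_irregular(z0, w), w0)
```

The coefficients χ and ω of the second function W(z, w) could be estimated in the same way.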


Next we shall prove that the pseudo-angle (the differential invariant (15)
or (16)) of Theorem 5 characterizes the infinite group G of regular
transformations. Let the transformation T preserve the differential invariant
(15) between every lineal element c(z', w') and every hypersurface element
h(a, b) which passes through the common point p. Then under T we must have

(23)    \frac{a z' + b w'}{\bar{a}\,\bar{z}' + \bar{b}\,\bar{w}'} = \frac{A Z' + B W'}{\bar{A}\,\bar{Z}' + \bar{B}\,\bar{W}'},

where the capital letters denote the transformed lineal element C(Z', W') and
the transformed hypersurface element H(A, B).

First we shall show that any isoclinal pair of lineal elements c1(z1', w1')
and c2(z2', w2') is converted into an isoclinal pair of lineal elements
C1(Z1', W1') and C2(Z2', W2'). Since c1(z1', w1') and c2(z2', w2') are contained
in an isoclinal surface element, we must have

(24)    \frac{z_2'}{z_1'} = \frac{w_2'}{w_1'} = \lambda,

where λ is a fixed non-real complex number. Let us pass any one of the ∞²
hypersurface elements h(a, b) through the lineal element c1(z1', w1'). Then
under T the transformed hypersurface element H(A, B) must contain the
transformed lineal element C1(Z1', W1'). Hence we must have

(25)    a z_1' + b w_1' + \bar{a}\,\bar{z}_1' + \bar{b}\,\bar{w}_1' = 0, \qquad A Z_1' + B W_1' + \bar{A}\,\bar{Z}_1' + \bar{B}\,\bar{W}_1' = 0.

Under the transformation T, the pseudo-angle between the lineal element
c2(z2', w2') and any one of the ∞² hypersurface elements h(a, b) through the
lineal element c1(z1', w1') must be equal to the pseudo-angle between the
transformed lineal element C2(Z2', W2') and the corresponding transformed
hypersurface element H(A, B). This means that the equation (23) must be
valid for these lineal and hypersurface elements. Then because of (24) and


(25), the equation (23) becomes

(26)    \frac{A Z_2' + B W_2'}{\bar{A}\,\bar{Z}_2' + \bar{B}\,\bar{W}_2'} = -\frac{\lambda}{\bar{\lambda}}.

This equation must be true for all the ∞² hypersurface elements H(A, B)
which pass through the lineal element C1(Z1', W1').
From this equation, and from the fact that
$\bar{A}\bar{Z}_1' + \bar{B}\bar{W}_1' = -(AZ_1' + BW_1')$, we find that

(27)    A Z_1' + B W_1' = i\rho_1, \qquad A Z_2' + B W_2' = i\lambda\rho_2,

where ρ1 and ρ2 are arbitrary real numbers. Let us suppose that
Z2'/Z1' ≠ W2'/W1'. From these two equations, we can solve for A and B in terms
of the arbitrary real numbers ρ1 and ρ2. Thence A and B are linear homogeneous
functions of ρ1 and ρ2. This proves that the equation (26) can hold for only ∞¹
hypersurface elements passing through the lineal element C1(Z1', W1'). This
contradicts the fact that the equation (26) must hold for all the hypersurface
elements H(A, B) through the lineal element C1(Z1', W1'). Hence we must
have


(28)    \frac{Z_2'}{Z_1'} = \frac{W_2'}{W_1'}.


This shows that the transformed lineal elements C1(Z1', W1') and C2(Z2', W2')
must be contained in an isoclinal surface element. Therefore every isoclinal
pair of lineal elements is converted by T into an isoclinal pair of lineal
elements.

Since any isoclinal pair of lineal elements is carried by T into an isoclinal
pair of lineal elements, it follows that T carries every isoclinal surface
element into an isoclinal surface element. Hence every conformal surface
becomes a conformal surface and the transformation T must therefore be a
regular transformation. Thus we have proved


THEOREM 7. Any transformation T of the four-space S4 which preserves the
pseudo-angle (the differential expression of the first order (15) or (16)) between
every curve and every hypersurface evaluated at their common point must be a
regular transformation. Thus the infinite group G of regular transformations is
characterized by the fact that it leaves invariant the pseudo-angle between every
curve and every hypersurface.


7. The Picard sixteen-parameter group G16 of linear fractional
transformations. In this section, we shall give a characterization of the group
G16 of the linear fractional transformations in z and w

(29)    Z = \frac{a_1 z + b_1 w + c_1}{a_3 z + b_3 w + c_3}, \qquad W = \frac{a_2 z + b_2 w + c_2}{a_3 z + b_3 w + c_3}.


Any transformation of the form (29) is a quadric Cremona transformation.
It may be considered to be a direct generalization of the Moebius group of
circular transformations. Of course, it is not the inversion group of the
four-space S4. As a matter of fact, any hypersphere (or any hyperplane) is
converted by (29) into a special type of quadric hypersurface.

Under any regular transformation T, let us find what isoclinal planes
become isoclinal planes. For this to be so, the differential equation d²w/dz² = 0
must be carried into the differential equation d²W/dZ² = 0. Hence those
isoclinal planes which become isoclinal planes under the regular
transformation T must satisfy the equation


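(30)    \left(Z_z + \frac{dw}{dz} Z_w\right)\left[W_{zz} + 2\frac{dw}{dz} W_{zw} + \left(\frac{dw}{dz}\right)^2 W_{ww}\right]
        - \left(W_z + \frac{dw}{dz} W_w\right)\left[Z_{zz} + 2\frac{dw}{dz} Z_{zw} + \left(\frac{dw}{dz}\right)^2 Z_{ww}\right] = 0.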


First, if this equation is an identity in dw/dz, we find that Z and W must
be given by the equations (29). That is, the group G16 of linear fractional
transformations as given by the equations (29) converts every isoclinal plane
into an isoclinal plane.
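
The statement that a transformation of form (29) carries isoclinal planes into isoclinal planes can be tried out numerically. The Python sketch below maps three points of an isoclinal plane w = mz + c and tests whether their images still satisfy a relation of the same form, using a 3x3 collinearity determinant over the complex numbers. The coefficient matrix, the plane, and the comparison map Z = z, W = w + z² are hypothetical sample choices, not data from the paper.

```python
def linear_fractional(m, z, w):
    """Map (29): rows m[0], m[1] give the numerators, m[2] the common denominator."""
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = m
    den = a3 * z + b3 * w + c3
    return (a1 * z + b1 * w + c1) / den, (a2 * z + b2 * w + c2) / den

def collinearity_det(points):
    """Determinant of the rows (Z, W, 1); zero exactly when the three points
    satisfy one complex-linear relation, i.e. lie in one isoclinal plane."""
    (z1, w1), (z2, w2), (z3, w3) = points
    return (z2 - z1) * (w3 - w1) - (z3 - z1) * (w2 - w1)

# Assumed sample coefficients for (29) (hypothetical, nonsingular).
M = ((1 + 1j, 2, 0.5j),
     (-1, 1 - 1j, 2),
     (0.3, 0.1j, 1))

# Three points on the isoclinal plane w = slope * z + intercept.
slope, intercept = 0.5 + 1j, 1 - 0.5j
plane_points = [(z, slope * z + intercept) for z in (0j, 1 + 0j, 1j)]

# Images under (29), and under a regular map that is not linear fractional.
images_fractional = [linear_fractional(M, z, w) for z, w in plane_points]
images_other = [(z, w + z * z) for z, w in plane_points]

det_fractional = collinearity_det(images_fractional)
det_other = collinearity_det(images_other)
```

A determinant near zero for the first set of images and clearly nonzero for the second is consistent with the dichotomy described in the text, although a single sample plane is of course no proof.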

Next if the above equation is not identically zero, we can solve (30) for 
dw/dz and obtain at most three differential equations of the form 


(31)    \frac{dw}{dz} = f(z, w),

where f is an analytic function of z and w. Any such differential equation
contains ∞² solutions. Thus we have proved


THEOREM 8. If a regular transformation T converts 4∞² isoclinal planes into
isoclinal planes, then every isoclinal plane is converted into an isoclinal plane,
and therefore T is a transformation of the group G16 of the linear fractional
transformations as given by equations (29). Any other regular transformation T
converts at most 3∞² isoclinal planes into isoclinal planes.


It is found that, under the group G16 of fractional linear transformations
as given by (29), the family of quadric hypersurfaces

(32)    a z\bar{z} + b w\bar{w} + \gamma z\bar{w} + \bar{\gamma}\bar{z}w + \delta z + \bar{\delta}\bar{z} + \epsilon w + \bar{\epsilon}\bar{w} + f = 0,

where a, b, f are arbitrary real constants and γ, δ, ε are arbitrary complex
constants, is converted into itself. The real form of this family of quadric
hypersurfaces is

(33)    a(x^2 + y^2) + b(u^2 + v^2) + 2c_1(ux + vy) + 2c_2(uy - vx)
        + 2d_1 x + 2d_2 y + 2e_1 u + 2e_2 v + f = 0.

There are ∞⁸ hypersurfaces in this family. Every hypersphere (or every
hyperplane) of the four-space S4 becomes a special quadric hypersurface of the
form (32) or (33). Also the intersection of any isoclinal plane with this special
quadric hypersurface is a circle. Thus any transformation of the form (29)
induces a Moebius circular transformation between the isoclinal planes of the
four-space S4. In this respect, the group G16 of linear fractional
transformations in z and w may be regarded as a generalization of the Moebius
group of circular transformations to four-space. Also the family of special
quadric hypersurfaces (32) or (33) can be considered to be a generalization of
the family of circles.

In conclusion I wish to express my thanks to Dr. J. De Cicco for his valu- 
able assistance in writing this paper. 


COLUMBIA UNIVERSITY,
NEW YORK, N. Y.


ARC- AND TREE-PRESERVING TRANSFORMATIONS 


BY 
D. W. HALL(1) AND G. T. WHYBURN


1. Introduction. In an earlier paper(2) by one of us, referred to hereafter as
A.P.T., arc-preserving transformations were defined and studied in connec- 
tion with an irreducibility condition on the transformation. It was shown, 
for example, that if A and B are compact locally connected metric continua 
which are cyclic (that is, without cut points) any single valued continuous 
arc-preserving and irreducible transformation T(A)=B of A onto B is neces- 
sarily a homeomorphism. (“Arc-preserving” means that the image of every 
simple arc in A is either a simple arc or a single point of B; irreducibility of T 
means that no proper subcontinuum of A maps onto all of B.) It was shown, 
furthermore, that in case A is hereditarily locally connected the same con- 
clusion holds without the assumption of irreducibility; and the prediction was 
made that this is true in the general case. 

Now as pointed out in A.P.T., if A is a compact continuum and T(A) = B
is continuous, then, since the property of being a subcontinuum of A mapping
onto all of B under T is inducible, there always exists a subcontinuum A1
of A such that T(A1) = B and T is irreducible on A1. However, since local
connectedness of A would certainly not in general insure local connectedness
of A1, it is not possible always to reduce the set A so as to make the
transformation irreducible without sacrificing essential properties of A.

In the present paper we propose not only to completely justify the earlier 
prediction referred to above, but also to obtain theorems concerning a much 
more general type of transformation than “arc-preserving” which will give 
all the theorems of the first three sections of A.P.T. as immediate corollaries. 

R. G. Simond(3) has studied tree-preserving transformations on locally
connected compact and metric continua (that is, transformations T(A) = B
satisfying the condition that the image of every tree (or dendrite) in A is a
tree in B). Miss Simond has proved with considerable difficulty that every
arc-preserving transformation is tree-preserving. We show that this result 
is an immediate consequence of one of our theorems, as it is also of a theorem 
of A.P.T. In fact our first principal result, the proof of which is very much 
simpler than that given by Simond, shows that in order that T(A)=B be 
tree-preserving it is necessary and sufficient that the image of every simple 


Presented to the Society, December 28, 1939; received by the editors January 8, 1940. 

(1) This work was started when the first named author was a National Research
Fellow at the University of Virginia.

(2) See G. T. Whyburn, Arc-preserving transformations, American Journal of Mathematics,
vol. 58 (1936), pp. 305-312.

(3) Duke Mathematical Journal, vol. 4 (1938), pp. 575-589.


64 D. W. HALL AND G. T. WHYBURN [July 


arc in A shall be a tree in B. We also give an independent proof of the Simond 
theorem. 

If A is a tree, every simple arc in A is a cyclic chain(4) in A. This shows
at once that if A and B are both trees and T(A) = B is continuous, then in
order that T be arc-preserving it is necessary and sufficient that T be cyclic
chain-preserving. This immediately suggests another of our main results: If A
is a compact locally connected continuum and T(A) = B is continuous, then
in order that T be arc-preserving it is necessary and sufficient that T be both 
tree-preserving and cyclic chain-preserving. We give this theorem added 
meaning by obtaining a somewhat unexpected characterization of tree-pre- 
serving transformations in terms of the action of these transformations and 
their inverses on the sets A and B. 

In §5 a characterization of A-set reversing transformations(5) is given 
which supplements the treatment of this type of transformation initiated in 
A.P.T. 

In conclusion we might mention that if B is cyclic, the following types of 
transformations are equivalent: (a) arc-preserving, (b) tree-preserving, (c) A- 
set reversing, (d) monotone retracting. 

Throughout the paper all transformations are assumed to be single valued 
and continuous and all continua compact and metric. 

2. Principal results. We assume throughout this section that A and B are 
locally connected continua and that T(A) = B maps every arc of A onto a
dendrite in B. 


(2.0) The image of every dendritic graph in A is a dendrite in B. 


Proof. (By induction on the number of end points.) If the number of end
points in the graph is 2, then the hypothesis gives the conclusion since the
graph is an arc. Suppose that any dendrite in A having k or less end points
(k>0) maps onto a dendrite in B. Let D be a dendrite in A having k+1 end
points. Let p be a branch point in D giving a decomposition D = D1 + D2 + D3,
where D1, D2, and D3 are dendrites intersecting by pairs in p. Since D1, D2, D3,
D1+D2, D1+D3, D2+D3 are dendrites having at most k end points, each of
their transforms is a tree. Hence T(D1)·T(D3) and T(D2)·T(D3) are connected;
and since both of these sets contain T(p), their sum T(D3)·T(D1+D2)
is connected. Thus T(D) = T(D1+D2+D3) is a dendrite.


(4) If M is a locally connected compact and metric continuum and A is a closed subset of M
containing every simple arc axb of M such that a and b are points of A, then A is called an A-set.
By the cyclic chain in M determined by two points a and b of M and designated by C(a, b) is
meant the product of all A-sets in M containing both a and b. It is the minimal A-set in M
containing both these points. The cyclic chains in M are closely related to the decomposition
of M into its cyclic elements, for which see Kuratowski and Whyburn, Fundamenta
Mathematicae, vol. 16 (1930), pp. 305-331. See also G. T. Whyburn, American Journal of
Mathematics, vol. 50 (1928), pp. 167-194, and W. L. Ayres, these Transactions, vol. 30 (1928), pp.
567-578, and vol. 31 (1929), pp. 595-695.

(5) For definition, see §5.


(2.1) If B is cyclic, T is A-set reversing. 


Proof. Otherwise there exists a simple arc b1xb2 in A such that
T(b1) = T(b2) = b, but T^{-1}(b)·b1xb2 = b1 + b2. Let T(b1xb2) = X and
K = T^{-1}(X). From (2.0) it follows that X is a dendrite; hence X ≠ B.
Thus A − K ≠ 0.


(i) For any component R of A − K, T(F(R)) is a single point(6).


Otherwise there exists a simple arc cyd in R+c+d such that c and d lie
in F(R) and T(c) ≠ T(d). But now if both c and d are on b1xb2 (in the order
b1, c, d, b2) we let t = b1c + cyd + db2; if not, let t be a dendritic graph in A
which contains both b1xb2 and cyd. This is impossible by (2.0) since in either
case T(t) must contain a simple closed curve.


(ii) There exist two components R and S of A − K such that a = T(F(R))
≠ T(F(S)) = c and T(R)·T(S) ≠ 0.


For let R be any component of A − K and let Q be the sum of all those
components U of A − K such that T(F(U)) = T(F(R)) = a. Then since
Q + T^{-1}(a) is closed, it follows that T(Q) + a is closed. Since a is not a cut
point of B, it follows that some point x of T(Q) must be a limit point of
B − T(Q). (We know that T(Q) + a ≠ B since X is not a single point.) But Q
is open in A; consequently T^{-1}(x) must intersect some component S of A − K
which does not belong to Q. Thus if we set c = T(F(S)), (ii) is satisfied.

Now to prove (2.1), let y' and y'' be points of R and S, respectively, such
that T(y') = T(y'') = y. Then there exists a dendritic graph t' in A containing
y' and b1xb2. If t' contains y'', let t = t'. Otherwise there exists a dendritic
graph t in A containing both y'' and t'. It is immediate that T(t) contains a
simple closed curve, contradicting (2.0).

We have at once: 


(2.11) If A and B are both cyclic, T is a homeomorphism. 


(2.2) In order that a single valued continuous transformation T(A)=B 
shall be tree-preserving it is necessary and sufficient that the image of every arc 
in A shall be a tree in B. 


Proof. The necessity is trivial. To prove the sufficiency suppose A is a
tree and that B has a true cyclic element E_b. Let W(B) = E_b be monotone
and retracting. Then WT(A) = E_b is a transformation which maps arcs into
trees. Thus, by (2.1), WT is A-set reversing, hence monotone. But this makes
E_b a tree, which is absurd.


(6) For any open set G, F(G) denotes the boundary of G, that is, the set $\bar{G} - G$.




(2.3) If B is cyclic and no A-set in A other than A itself maps onto all of B,
then A is cyclic and T is a homeomorphism. 


Proof. For if A had a cut point p, we could write A = A1 + A2, where A1
and A2 are A-sets with A1·A2 = p. Then since, by (2.1), T is monotone, either
A1 − p or A2 − p, say A1 − p, contains the set T^{-1}(B − T(p)), as this latter set
is connected. But this gives T(A1) = B.


(2.4) If B is cyclic, T is equivalent to a monotone transformation retracting A 
onto some true cyclic element of A. Thus, in this situation, “arc-preserving,” 
“tree-preserving,” “A-set reversing,” and “monotone retracting” are all equiva- 
lent. 


Proof. For let E_a be a minimal A-set in A mapping onto all of B under T.
Then since T(E_a) = B is a transformation mapping arcs into trees, it follows
from (2.3) that E_a is a cyclic element of A and E_a maps onto all of B
topologically under T. Thus if for each y in B we set h(y) = E_a·T^{-1}(y), then h is
topological and the transformation hT(A) = E_a is retracting. Furthermore,
hT is monotone since both h and T are monotone. Obviously hT is equivalent
to T, since h^{-1}(hT) = T.


DEFINITION. For any true cyclic element E_a of A we define E_a^0 as the set of
all internal points of E_a, that is, all points x of E_a which are non-cut points of A.
It is well known that the set of all non-internal points of any such E_a is countable.


(2.5) For each true cyclic element E_b of B there exists a unique true cyclic
element E_a of A such that T(E_a) = E_b. The transformation T is a homeomorphism
on E_a and T^{-1} is single valued on T(E_a^0).


Proof. For let W(B) = E_b be monotone and retracting. Since WT(A) = E_b
is a transformation which maps arcs into trees, it follows from (2.4) that there
exists a true cyclic element E_a of A and a homeomorphism h(E_b) = E_a such
that hWT(A) = E_a is monotone and retracting. Since E_a maps topologically
under hWT, it must therefore map topologically under T. Let y = T(x) be a
point of T(E_a^0), where x lies in E_a^0. Since x is an internal point of E_a and hWT
is monotone and retracting, we see that
x = (hWT)^{-1}(x) = T^{-1}W^{-1}h^{-1}(x) = T^{-1}W^{-1}(y) ⊃ T^{-1}(y),
and hence x = T^{-1}(y). The uniqueness of E_a follows
at once from this single-valuedness of T^{-1}.


(2.6) Let A be a compact locally connected continuum and T(A) = B be
continuous. Then in order that T be tree-preserving it is necessary and sufficient that
for each true cyclic element E_b of B there exist a true cyclic element E_a of A
mapping onto E_b topologically under T and such that T^{-1} is single valued on the set
T(E_a^0).


Proof. The necessity follows from (2.5). To establish the sufficiency we
need only, in virtue of (2.2), show that the image of every simple arc t in A
is a tree in B. Assuming the contrary, T(t) must contain a simple closed curve
J'. Let E_b be the true cyclic element of B containing J' and E_a the true cyclic
element of A which satisfies the conditions of the theorem. It follows at once
that t·E_a is a simple arc axb which maps into an arc a'x'b' of J'. Let a'y'b'
be the other arc of J', and suppose that y' is the image of an internal point
of E_a. Then T^{-1}(y') contains a point of E_a^0 and a point of t − E_a, which is
impossible.


(2.7) Let A be a compact locally connected continuum and let T(A) = B be
continuous. Then in order that T be arc-preserving it is necessary and sufficient
that it be tree-preserving and that the image of each cyclic chain(7) in A be a
cyclic chain in B.


Proof. Necessity: The first condition is necessary by (2.2). That the second 
condition is necessary results essentially from the fact that, for arc-preserving 
transformations, (1) A-sets map onto A-sets, and (2) the property of having 
any three points on an arc is invariant. 

We first show that A-sets map onto A-sets. Let A' be an A-set in A and
T(A') = B'. For any cyclic element E_b of B intersecting B' in at least two
points, let E_a be the corresponding cyclic element of A given by (2.6). Since
E_b·B' is a nondegenerate continuum and T(E_a) = E_b is topological, E_b·B'
must contain the image y of at least one internal point x of E_a. Then since
x = T^{-1}(y) we see that x must lie in A'. Thus E_a is contained in A',
consequently E_b is contained in B'. Therefore, B' is an A-set in B.

Now to prove the necessity of the second condition of the theorem, let
C(a, b) be a cyclic chain in A. Then T(C(a, b)) = K is an A-set in B. Let x, y, z
be points of K and x', y', z' be points of C(a, b)·T^{-1}(x), C(a, b)·T^{-1}(y),
C(a, b)·T^{-1}(z), respectively. There exists an arc cd in C(a, b) containing x', y',
and z'. Hence T(cd) is an arc in K containing x, y, and z. Therefore, K is a
cyclic chain (since for A-sets the property of being a cyclic chain is equivalent
to the property of containing an arc through any three points).

Sufficiency: Let ab be any simple arc in A. We first show that if E_b is any
true cyclic element of B such that E_b·T(ab) is nondegenerate and E_a is the
corresponding cyclic element of A given by (2.6) and xy is the arc E_a·ab, then
T(ax + yb)·E_b = T(x) + T(y). (We may suppose the order a, x, y, b.) If this is
not so, then T(ax)·E_b or T(yb)·E_b, say T(ax)·E_b, is a nondegenerate continuum;
hence there must exist a point z distinct from T(x) of T(E_a^0) which belongs to
E_b·T(ax). This contradicts (2.6), since T^{-1}(z) intersects both E_a and ax − x.
Thus E_b·T(ab) is a simple arc x'y' = T(xy). Furthermore, no interior point of
x'y' is a limit point of T(ab) − x'y'.

Now if T(ab) is not a simple arc it cannot be a simple closed curve. This
follows either from (2.2) or from what was just shown. Hence T(ab) must
contain a triod oc + od + oe = t. But, since T is cyclic chain-preserving, T(C(a, b))
must contain a simple arc through the three points c, d, e, say cde. Then
either cd or de does not contain o, say de does not contain o. Thus od + oe + de
contains a simple closed curve J1 containing nondegenerate subarcs od' and
oe' of od and oe; and if E_b is the cyclic element of B containing J1,
E_b·T(ab) ⊃ E_b·t ⊃ d'e' = od' + oe' and o is a limit point of T(ab) − d'e',
contrary to what was shown above.


(7) See footnote 4.

3. Supplementary results. We give here some additional results, throwing 
light on the action of arc-preserving transformations. We assume throughout 
this section that A is a compact locally connected continuum and that 
T(A) =B is arc-preserving. 


(3.1) The image of each A-set in A is an A-set in B; the image of each cyclic 
chain in A is a cyclic chain in B. 


(3.2) For each true cyclic element E_b of B there exists a unique true cyclic
element E_a of A which maps onto E_b topologically under T.


These are direct consequences of (2.6) and (2.7).


(3.3) The image of every true cyclic element E_a of A is either a single point,
a true cyclic element E_b of B, or a free arc of B which is also a cyclic chain of B.
If T(E_a) = E_b, then T is topological on E_a.


Proof. We have at once that T(E_a) is a cyclic chain C(x, y) in B, if we
assume that it is not a single point. If C(x, y) is a single true cyclic element E_b
of B, then our conclusion follows at once. Hence we may assume that x and y
are distinct and that there exists at least one point z which separates x and y
in B.

Let x' and y' be points of E_a mapping into x and y, respectively, and let J
be a simple closed curve in E_a containing x' and y', and define J' = T(J). We
first show that J' is a free arc of B. Regarding J as our space, we see that
T is arc-preserving on J. If J' contains a true cyclic element F of itself, then
by (3.2), T(J) = F is a homeomorphism. This is impossible since x and y are
separated in B by the point z. Thus J' is a dendrite. But J' contains no triod,
since if it did we could easily find an arc of J having a triod in its image.
Therefore, J' is a simple arc.

Assume that J' = a'd'b' is not a free arc of B. Then there exists a triod t
in B having d' as center and a', b' as two of its end points, where d' is some
interior point of J'. Let c' be the other end point of t. Then the three arcs
a'd', b'd', c'd' of t are disjoint except for d' and we may let {c_i'} be a sequence
of distinct points on c'd' converging to d' as a limit. It follows at once that
there exists a point d in T^{-1}(d') and a sequence of points {c_i} in {T^{-1}(c_i')}
converging to d.

Since A is a locally connected continuum, there exists a region R in A
containing d but disjoint from both T^{-1}(a') and T^{-1}(b'). This region is
arcwise connected and hence contains a simple arc cd, where c is one of the
points c_i.

If d lies on J we let z be the first intersection of cd with J, and consider
an arc G defined as the sum of cz and an arc of J intersecting both T^{-1}(a')
and T^{-1}(b'). It is immediate that T(G) contains a triod, which is impossible
since T is arc-preserving. Thus d does not lie on J.

Let z be the first intersection of the arc cd with the closed set T^{-1}(J'),
and define cxe as a simple arc in A having the unique point e in common with
J, and on cxe let y be the last intersection with cz. Define an arc H as follows:
(a') if y is not z then H is the sum of the subarc zy of zc and the subarc ye
of cxe; (b') if y is z then H is the sum of cz and the subarc ze of cxe. Let G
be a simple arc of A composed of H and a subarc of J intersecting both T^{-1}(a')
and T^{-1}(b'). Then T(G) contains a triod, which is impossible.

Thus J' is a free arc a'b' of B. Thus every point of the open subarc xy of
a'b' must separate x and y in B. Accordingly, C(x, y) = xy and C(x, y) is a
free arc of B.


(3.4)(8) If a and b are two points of A having the same image point under T,
then no true cyclic element of the chain C(a, b) can map topologically under T.
Thus each true cyclic element in C(a, b) maps into either a single point or a free
arc of B.


Proof. Let T(a) = T(b) and suppose there is a cyclic element E_a of C(a, b)
which maps onto a cyclic element E_b of B topologically. Let aqrb be a simple
arc in A where aq·E_a = q, rb·E_a = r. Then either aq or rb is nondegenerate,
and we may suppose aq is nondegenerate (rb may or may not be nondegenerate).
Then T(aq + rb) = K is a continuum. Furthermore, K·E_b contains the
two distinct points T(q) and T(r). Hence K·E_b contains a point x distinct
from both T(q) and T(r) which is the image of an internal point x_0 of E_a.
This is impossible, since T^{-1}(x) also intersects aq + rb − (q + r), whereas by
(2.7) and (2.6), T^{-1}(x) must consist of a single point.

4. Dendrite-preserving property of arc-preserving transformations. If A 
is a dendrite (or tree) and T(A) = B is arc-preserving, then it follows at once
from (2.2) that T is dendrite-preserving. This fact can also be seen from (2.5), 
since if B had a true cyclic element E, then A would also have one. This 
dendrite-preserving property of arc-preserving transformations was first 
noted and proven by R. G. Simond(*) as mentioned in the introduction of 
this paper. However, it is interesting to note that it follows directly from 
(2.4) of A.P.T. by the reasoning just given above, since the irreducibility of T 


(8) This is closely related to (2.3) of A.P.T. and, indeed, yields (2.3) of A.P.T. as a special 
case 


(*) See footnote 3. 




assumed in (2.4) of A.P.T. does not limit the generality because we can take a 
sub-dendrite of A on which T is irreducible. 

In view of the considerable length and difficulty of Miss Simond’s proof, 
the following one which is self-contained and independent of all other results 
on arc-preserving transformations may be of interest. 


THEOREM (Simond). Arc-preserving transformations are dendrite-preserv- 
ing. 


Proof. Let T(A) = B be arc-preserving, where A is a dendrite. We first
show: 


I. (Whether A is a dendrite or not.) If t = oa+ob+oc is a triod such that
T(o) = o', T(a) = a', T(b) = b', T(c) = c', and if $t \cdot T^{-1}(a'+b'+c') = a+b+c$,
then $T(oa) \cdot T(ob) = T(oa) \cdot T(oc) = T(ob) \cdot T(oc) = T(o) = o'$.


For if, say, $T(oa) \cdot T(ob)$ contains a point q' distinct from o', we may
suppose the order a', o', q', b' on the arc a'o'b' = T(aob). Then T(ao) is a subarc
a'o'q' of a'o'b'. Hence the arc T(aoc) consists of a'o'q' plus an arc q'c' from
q' to c' which contains neither o' nor b'. But then the arc T(boc) = b'o'c' would
contain both the arc o'q'b' of a'o'b' and the arc q'c', which is impossible since
clearly o'q'b'+q'c' contains a triod. This proves I.

Now suppose, contrary to the theorem, that B has a true cyclic element
B'. Let A' be a minimal A-set in A such that T(A') contains B'. Then since
A' is a dendrite but not an arc, there exists a point o in A' and three continua
X, Y, Z such that A' = X+Y+Z and $X \cdot Y = Y \cdot Z = Z \cdot X = o$. Let T(o) = o'.
Since B' is cyclic and T(Y+Z) does not contain B', there exists a point q'
in B'-o' and points x in X, y in (Y+Z), such that T(x) = T(y) = q'. Clearly
we may suppose y in Y. Take the arcs xo and yo in X and Y respectively.
Then since both T(xo) and T(yo) contain arcs from o' to q' whereas T(xo+oy)
must be a simple arc, clearly $T(xo) \cdot T(oy)$ contains an arc from o' to q'. Hence
there is no loss of generality in assuming (as we shall do) that both x and y are
cut points of A'. Let $R_x$ and $R_y$ be components of A'-x and A'-y lying in
X-x and Y-y respectively. Then since no one of the sets $T(A'-R_x)$,
$T(A'-R_y)$, T(X+Y) contains B', there exist points a', b', and c' in B' such
that $T^{-1}(a') \cdot A' \subset R_x$, $T^{-1}(b') \cdot A' \subset R_y$, $T^{-1}(c') \cdot A' \subset A'-(X+Y)$.
Let $a \in T^{-1}(a')$, $b \in T^{-1}(b')$, and $c \in T^{-1}(c')$ be so chosen that for the
arcs ax, yb, and oc in A' we have $ax \cdot T^{-1}(a') = a$, $yb \cdot T^{-1}(b') = b$, and
$T^{-1}(c') \cdot oc = c$. Let oa = ox+xa, ob = oy+yb. Then t = oa+ob+oc is a triod
satisfying the conditions in I. However, since each of the sets T(oa) and T(ob)
contains both o' and q' we have a contradiction to I. Thus B can have no true
cyclic element and hence must be a dendrite.

5. A-set reversing transformations. In conclusion, we give a characteriza- 
tion of A-set reversing transformations which is made possible by (2.6) and 
(2.7). We recall that T(A) = B is A-set reversing provided that for each b


in B, $T^{-1}(b)$ is either a single point or an A-set in A, it being assumed that A
is a compact locally connected continuum. We make use here of certain results
concerning this type of transformation which were established in §4 of
A.P.T.


THEOREM. If A is a compact locally connected continuum and T(A)=B is 
arc-preserving, then in order that T be A-set reversing it is necessary and suffi- 
cient that the following conditions hold: (a) there exists no true cyclic element E 
in A such that T(E) is a free arc of B; (b) if K is the set of all cut points and end 
points of A, then T is monotone on T(K). 


Proof. The necessity follows at once from the definition and the fact that 
every A-set reversing transformation is monotone (A.P.T., (4.12)). 

Sufficiency: By (A.P.T., (4.1)) we must show that T is monotone on each 
simple arc in A. If this is not the case, there exists a simple arc axb in A such
that $T(a) = T(b) \ne T(x)$ for any point x interior to axb. If a and b are conjugate
points, they lie in the same true cyclic element E of A and it is immediate
from (a) and (3.3) that T is monotone on axb. Thus there exists a point q
interior to axb which separates a and b in A. It follows at once from (b) that a
is an internal point of a true cyclic element E of A which maps topologically
onto a true cyclic element F of B. But $T^{-1}$ is single valued on $T(E^0)$, where $E^0$
represents the set of all internal points of E.


BROWN UNIVERSITY, 
PROVIDENCE, R. I., 

UNIVERSITY OF VIRGINIA, 
CHARLOTTESVILLE, VA. 




ORTHOGONAL POLYNOMIALS WITH AUXILIARY 
CONDITIONS 


BY 
DUNHAM JACKSON 


1. Introduction. Let $U_1(f), U_2(f), \dots, U_m(f)$ be m linear functionals,
$m \ge 1$, each defined for a class of functions f including all polynomials in a
single variable x. The characterization of a functional U(f) as linear means
here merely that if $f_1$ and $f_2$ are any two functions to which the operation
applies, $U(c_1 f_1 + c_2 f_2) = c_1 U(f_1) + c_2 U(f_2)$. This paper is concerned with sets of
polynomials $p_n(x)$ orthogonal on an interval (a, b), and satisfying the auxiliary
conditions $U_i(p_n) = 0$ for $i = 1, 2, \dots, m$, and for each value of n. Two of the
simplest special cases, one corresponding to the single condition $p_n(1) = p_n(-1)$
and the other to the condition $p_n(1) = -p_n(-1)$, have already been discussed
elsewhere(1). It will be shown here that certain formal propositions with regard
to such orthogonal systems can be stated with a considerable degree of
generality, while the theory of convergence is carried appreciably beyond the
stage previously attained.

2. Construction of the orthogonal system. If 


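\[ p_n(x) = a_{n0} + a_{n1}x + \cdots + a_{nn}x^n, \]

it follows from the property of linearity that

\[ U_i(p_n) = \sum_{k=0}^{n} a_{nk}\gamma_{ik}, \qquad \gamma_{ik} = U_i(x^k). \]

To the given set of auxiliary conditions there corresponds a matrix

\[ (1) \qquad
\begin{pmatrix}
\gamma_{10} & \gamma_{11} & \gamma_{12} & \cdots \\
\gamma_{20} & \gamma_{21} & \gamma_{22} & \cdots \\
\vdots & \vdots & \vdots & \\
\gamma_{m0} & \gamma_{m1} & \gamma_{m2} & \cdots
\end{pmatrix} \]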


Presented to the Society, September 7, 1939; received by the editors January 11, 1940. 

(1) See D. Jackson, A new class of orthogonal polynomials, American Mathematical Monthly,
vol. 46 (1939), pp. 493-497.

Since the present paper was written and since publication of the article in the Monthly,
I have received through the kindness of Professor Mauro Picone a reprint of a paper by
Wolfango Gröbner, Sistemi di polinomi ortogonali soddisfacenti a date condizioni, number 62
of the Pubblicazioni dell'Istituto per le Applicazioni del Calcolo, Consiglio Nazionale delle
Ricerche, Rome, 1939, which also initiates a theory of orthogonal polynomials with linear
homogeneous auxiliary conditions. That treatment and the one given here, however, diverge
almost from the beginning as to methods and results to such an extent that there is very
little duplication.




with m rows and infinitely many columns. Conversely, every such matrix, not
consisting entirely of zeros, can be regarded as defining a set of m (not
necessarily independent) linear homogeneous conditions $U_i(p_n) = 0$, significant for
an arbitrary polynomial.

Let $r_n$ be the rank of the matrix of the first n+1 columns of (1), and let
$r_{-1} = 0$. If $r_n = r_{n-1}$, there exist polynomials

\[ P_n(x) = a_0 + a_1 x + \cdots + a_n x^n \]

with $a_n \ne 0$, satisfying the m conditions $U_i(P_n) = 0$. For if $a_n$ is taken equal to
1, the relations to be satisfied by $a_0, \dots, a_{n-1}$ are

\[ \sum_{k=0}^{n-1} \gamma_{ik} a_k = -\gamma_{in}, \qquad i = 1, 2, \dots, m, \]

and the condition $r_n = r_{n-1}$ is precisely the condition that the matrix of this
system of equations have the same rank as the augmented matrix. If $r_n \ne r_{n-1}$,
that is, if $r_n = r_{n-1} + 1$, the equations are incompatible; the same is of course
true if instead of 1 any other value different from zero is assigned to $a_n$,
and there exists no polynomial of the nth degree with $a_n \ne 0$ satisfying the
conditions.

As n takes on the values 0, 1, 2, ..., since $r_n$ can never decrease, can never
increase by more than one unit at a time, and can never exceed m, there
will be at most m values of n for which there is no polynomial satisfying the
conditions. If $r_n$ never attains the value m, the m conditions are linearly
dependent; in the case of m independent conditions there are(2) just m exceptional
values of n. It will be assumed henceforth that the conditions are independent.
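The rank criterion lends itself to a direct check. The following is a minimal sketch, not part of the original paper: the helper names `rank` and `exceptional_degrees` are hypothetical, each functional is supplied through its values $U_i(x^k)$, and the convention $r_{-1} = 0$ is assumed.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows of Fractions) by Gaussian elimination."""
    m = [list(r) for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][col] / m[rk][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def exceptional_degrees(functionals, max_degree):
    """Degrees n <= max_degree for which r_n = r_{n-1} + 1 (with r_{-1} = 0)."""
    # gamma[i][k] = U_i(x^k); each functional is given as a map k -> U_i(x^k)
    gamma = [[Fraction(U(k)) for k in range(max_degree + 1)] for U in functionals]
    found, previous = [], 0
    for n in range(max_degree + 1):
        r_n = rank([row[: n + 1] for row in gamma])
        if r_n == previous + 1:
            found.append(n)
        previous = r_n
    return found

# U(f) = f(1) - f(-1): the condition p(1) = p(-1)
symmetric = lambda k: 1 - (-1) ** k
# U(f) = f(1) + f(-1): the condition p(1) = -p(-1)
antisymmetric = lambda k: 1 + (-1) ** k
# U(f) = f(2) - f(-2): a second symmetric condition at the points +2, -2
symmetric_at_two = lambda k: 2 ** k - (-2) ** k

print(exceptional_degrees([symmetric], 6))
print(exceptional_degrees([antisymmetric], 6))
print(exceptional_degrees([symmetric, symmetric_at_two], 6))
```

In these three examples the number of exceptional degrees equals the number of independent conditions, as the paragraph above asserts.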

Let polynomials satisfying the auxiliary conditions be constructed successively
for all possible values of n, and let Schmidt's process be applied to
these polynomials. It will be understood that the definition of orthogonality
and normalization involves a weight function $\rho(x)$ which is non-negative, and
positive on a set of positive measure on (a, b). Let the orthogonal polynomials
when normalized be denoted by $p_n(x)$, the subscript indicating the degree of
the polynomial in each case. For convenience of notation, let $p_n(x) \equiv 0$ for
the excluded values of n, and also for such negative values of n as may enter
into any of the subsequent formulas.
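As an illustration of the construction, the following sketch applies Schmidt's process in exact arithmetic to the single condition p(1) = p(-1) on (-1, 1) with weight unity. It is an assumption-laden example rather than the author's procedure: the seed polynomials ($x^n$ for even n, $x^n - x$ for odd n), the function names, and the omission of normalization (to stay within rational arithmetic) are all choices made here.

```python
from fractions import Fraction

def inner(p, q):
    """Integral of p(x) q(x) over (-1, 1) with weight 1; p, q are coefficient lists."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:          # odd powers integrate to zero
                total += a * b * Fraction(2, i + j + 1)
    return total

def evaluate(p, x):
    return sum(c * x ** k for k, c in enumerate(p))

def seed(n):
    """A polynomial of degree n with p(1) = p(-1): x^n (n even) or x^n - x (n odd)."""
    p = [Fraction(0)] * (n + 1)
    p[n] = Fraction(1)
    if n % 2 == 1:
        p[1] -= 1
    return p

def orthogonal_system(max_degree):
    """Orthogonal (not normalized) polynomials satisfying p(1) = p(-1)."""
    system = {}
    for n in range(max_degree + 1):
        if n == 1:                        # the one exceptional degree for this condition
            continue
        p = seed(n)
        for q in system.values():         # subtract the components along earlier members
            c = inner(p, q) / inner(q, q)
            p = [a - c * (q[k] if k < len(q) else 0) for k, a in enumerate(p)]
        system[n] = p
    return system

system = orthogonal_system(5)
for n, p in system.items():
    print(n, p)
```

The member of degree 2 comes out as $x^2 - 1/3$, a multiple of the Legendre polynomial of that degree, and the odd members vanish at ±1.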

Any polynomial of the nth degree satisfying the auxiliary conditions can
be expressed linearly in terms of $p_0, p_1, \dots, p_n$. For terms in $x^n, x^{n-1}, \dots$
can be removed successively by subtraction of multiples of $p_n, p_{n-1}, \dots$,
leaving each time a polynomial which satisfies the conditions; when a degree
is reached for which non-trivial polynomials satisfying the conditions do not


(2) See also Gröbner, loc. cit., p. 30, where the conclusion is stated with reference to a less
general system of auxiliary conditions.




exist, the leading coefficient in the corresponding remainder must already be 
zero. 

3. Recursion formula and Christoffel-Darboux identity. The ordinary 
procedure for setting up a recursion formula does not apply without modifica- 
tion, for if a polynomial satisfying the auxiliary conditions is multiplied by x 
the product does not satisfy the conditions in general. However, if each of 
the functionals $U_i(f)$ is expressible in terms of the values of f at a finite number
of points in the form

\[ (2) \qquad U_i(f) = C_{i1} f(y_1) + C_{i2} f(y_2) + \cdots + C_{i\nu} f(y_\nu), \]

where the y's are real, or else conjugate complex in pairs with corresponding
conjugate complex coefficients, the conditions $U_i(f) = 0$ are satisfied by any
polynomial which vanishes at $y_1, y_2, \dots, y_\nu$. (As a matter of notation, the
list $y_1, \dots, y_\nu$ is understood to include all the points that occur in any of
the U's; some of the coefficients $C_{ij}$ may be zero.) If q(x) is the product

\[ (x - y_1)(x - y_2) \cdots (x - y_\nu), \]

or any polynomial divisible by this product (or, with trivial increase of
generality but with a possible slight gain in simplicity or convenience, a constant
plus any such polynomial(3)), $q(x)p_n(x)$ satisfies the conditions for each value
of n, and is expressible linearly in terms of $p_0, \dots, p_{n+\mu}$, where $\mu \ge \nu$ is the
degree of q(x). By the property of orthogonality, the coefficient of $p_k(x)$ in this
representation is zero for $k < n - \mu$, and the representation has the form

\[ q(x) p_n(x) = \sum_{k=n-\mu}^{n+\mu} \lambda_{nk}\, p_k(x). \]

These formulas hold for all non-negative integral values of n without exception,
on the basis of the convention introduced above according to which
$p_k(x)$ is identically zero when not defined otherwise.
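The reason for introducing the multiplier q(x) can be seen on a small example. The sketch below is an illustration under stated assumptions, not part of the paper: it takes the single condition U(p) = p(1) - p(-1) = 0, so that q(x) = (x - 1)(x + 1), and the helper names are hypothetical.

```python
from fractions import Fraction

def multiply(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def evaluate(p, x):
    return sum(c * x ** k for k, c in enumerate(p))

def U(p):
    """The functional U(p) = p(1) - p(-1); the auxiliary condition is U(p) = 0."""
    return evaluate(p, 1) - evaluate(p, -1)

p = [Fraction(-1, 3), Fraction(0), Fraction(1)]   # x^2 - 1/3, satisfies U(p) = 0
x = [Fraction(0), Fraction(1)]                    # the multiplier x
q = [Fraction(-1), Fraction(0), Fraction(1)]      # q(x) = (x - 1)(x + 1)
q_plus_one = [Fraction(0), Fraction(0), Fraction(1)]  # a constant plus q(x), here x^2

print(U(p))                        # the condition holds for p
print(U(multiply(x, p)))           # multiplication by x destroys it
print(U(multiply(q, p)))           # multiplication by q(x) preserves it
print(U(multiply(q_plus_one, p)))  # and so does a constant plus q(x)
```

Multiplication by x carries $x^2 - 1/3$ out of the admissible class, while multiplication by q(x) or by q(x) + 1 keeps it inside, which is what makes a recursion formula in terms of the p's possible.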

From the recursion formula a Christoffel-Darboux identity can be derived 
in the usual way. 

Similar reasoning is possible if $U_i(f)$ involves a finite number of derivatives
at the points $y_j$. If $e_j$ is the order of the highest derivative involved at $y_j$,
q(x) as defined above is to be replaced by

\[ \prod_{j=1}^{\nu} (x - y_j)^{e_j + 1}, \]


(3) E.g. in the earlier paper referred to, American Mathematical Monthly, loc. cit., the y's
being the points ±1, $x^2$ was used as multiplier instead of $x^2 - 1$.




or by a polynomial divisible by this product, or by a constant plus such a
polynomial.

On the other hand, if there is just one auxiliary condition $U_1(f) = 0$, where

\[ U_1(f) = \int_{-1}^{1} f(x)\,dx, \]

there is certainly no polynomial q(x) (other than a constant) such that
$q(x)p_n(x)$ satisfies the condition for all values of n. For that would require
that q(x) be orthogonal to every polynomial whose integral over (-1, 1)
is zero, and so orthogonal to every Legendre polynomial of positive degree,
and such a polynomial is a constant. There is no recursion formula which
expresses $q(x)p_n(x)$ linearly in terms of the p's for all n, with a polynomial factor
q(x).
In the case of a single auxiliary condition $U_1(f) = 0$, with

\[ U_1(f) = C_1 f(y_1) + C_2 f(y_2) + \cdots + C_\nu f(y_\nu), \]

the one exceptional value of n for which $p_n(x) \equiv 0$, the smallest value of n
for which $U_1(x^n) \ne 0$, cannot exceed $\nu - 1$. For the equations $U_1(x^k) = 0$,
$k = 0, 1, \dots, \nu - 1$, constitute a set of linear equations for the C's, having
for its determinant the nonvanishing Vandermonde determinant of the powers
of the y's. That is to say, $U_1(x^k)$ cannot vanish for all these values of k
unless the C's are all zero.
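The Vandermonde argument can be tried on particular data. The following sketch is illustrative only (the function name is hypothetical and the sample coefficients are chosen here): it searches for the smallest n with $U_1(x^n) \ne 0$ among $n = 0, \dots, \nu - 1$, the points $y_j$ being assumed distinct.

```python
def first_nonvanishing_power(coefficients, points):
    """Smallest n < nu with C_1 y_1^n + ... + C_nu y_nu^n != 0, or None.

    For distinct points the Vandermonde argument shows that None can be
    returned only when every coefficient is zero.
    """
    nu = len(points)
    for n in range(nu):
        if sum(C * y ** n for C, y in zip(coefficients, points)) != 0:
            return n
    return None

# U(f) = f(1) - f(-1): nu = 2, exceptional degree 1 = nu - 1
print(first_nonvanishing_power([1, -1], [1, -1]))
# U(f) = f(-1) - 2 f(0) + f(1): nu = 3, exceptional degree 2 = nu - 1
print(first_nonvanishing_power([1, -2, 1], [-1, 0, 1]))
```

In both samples the bound $\nu - 1$ is attained, so it cannot be lowered in general.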

If there are m (linearly independent) conditions of the form (2), at least
one m-rowed determinant of the first $\nu$ columns of (1) must be different from
zero; $r_n = m$ for $n \ge \nu - 1$, and $p_n(x)$ is non-trivial for all values of $n \ge \nu$. For
if all the m-rowed determinants in the first $\nu$ columns were zero, the m sets
of quantities $U_i(x^k)$, $k = 0, 1, \dots, \nu - 1$, would be linearly dependent; there
would be numbers $b_1, \dots, b_m$, not all zero, such that

\[ b_1 U_1(x^k) + \cdots + b_m U_m(x^k) = 0, \qquad k = 0, 1, \dots, \nu - 1; \]

that is,

\[ C_1' y_1^k + \cdots + C_\nu' y_\nu^k = 0, \qquad C_j' = b_1 C_{1j} + \cdots + b_m C_{mj}. \]

By the argument of the preceding paragraph all the coefficients $C_j'$ must
vanish, which means that the m sets of coefficients $C_{i1}, \dots, C_{i\nu}$ are linearly
dependent.

4. Boundedness of the normalized polynomials; convergence. If f(x) is
an integrable function on (a, b), it can be formally expanded in a series of the
polynomials $p_n(x)$, the coefficients being determined in the usual way. When
there is a Christoffel-Darboux identity, it can be used for the study of
convergence in the same way as in connection with other orthogonal systems(4),
if the polynomials $p_n(x)$ are bounded as n becomes infinite, at the point where
convergence is to be proved.

The discussion of boundedness here will be not so much a general theory
as an exploration of the effectiveness of particular types of hypothesis leading 
to the property in question. The auxiliary conditions will in each case involve 
the values of the polynomials, or of the polynomials and their derivatives, 
at only a finite number of points, and even with this limitation will be rather 
highly specialized in form. The interval of orthogonality will for simplicity 
be taken as (—1, 1). The weight function, while open to subsequent general- 
ization, will in the first instance be taken as unity. 

Consider first the single condition $p_n(y_1) = p_n(-y_1)$. With $y_1 = 1$, this was
treated in the earlier paper to which reference has been made. The condition
is satisfied by any even polynomial, and by any polynomial which is divisible
by $x^2 - y_1^2$. For n even let $p_n(x)$ denote the normalized Legendre polynomial
of the nth degree. There is no polynomial of the first degree satisfying the
auxiliary condition. For n odd let $p_n(x) = (x^2 - y_1^2)\pi_{n-2}(x)$, where $\pi_k(x)$
denotes the polynomial of the kth degree in the orthonormal system corresponding
to weight function $(x^2 - y_1^2)^2$. For n odd, $\pi_{n-2}(x)$ is an odd polynomial,
since the weight function is even. Inasmuch as any odd polynomial is
orthogonal to any even polynomial, the even and odd p's together constitute the
desired orthogonal system. The normalized Legendre polynomials are uniformly
bounded in any closed interval interior to (-1, 1). The same is true(5)
of the polynomials $(x^2 - y_1^2)\pi_k(x)$. Hence $p_n(x)$ is similarly bounded for odd
as well as even n, if $y_1$ is not interior to the interval (-1, 1), and is uniformly
bounded in the interval except near the points $\pm 1$, $\pm y_1$, if $y_1$ is between -1
and 1.

Suppose there are two conditions, $p_n(y_1) = p_n(-y_1)$, $p_n(y_2) = p_n(-y_2)$.
They are satisfied by any even polynomial, and by any polynomial divisible
by $(x^2 - y_1^2)(x^2 - y_2^2)$. They are not satisfied by any polynomial of the first
or third degree. The orthogonal system consists of the normalized even Legendre
polynomials and the polynomials $(x^2 - y_1^2)(x^2 - y_2^2)\pi_{n-4}(x)$ with n odd,
where $\pi_k(x)$ denotes the general polynomial in the orthonormal system for
weight $(x^2 - y_1^2)^2 (x^2 - y_2^2)^2$. They are uniformly bounded throughout any closed
interval interior to (-1, 1) and not containing any of the points $\pm y_1$, $\pm y_2$.
The extension to an arbitrary number of conditions of the form $p_n(y_j) = p_n(-y_j)$
is obvious.

A set of conditions of the form $p_n(y_j) = -p_n(-y_j)$ leads to similar results.
For a single condition $p_n(y_1) = -p_n(-y_1)$, the orthonormal system consists of


(4) See e.g. D. Jackson, Series of orthogonal polynomials, Annals of Mathematics, (2), vol. 34 
(1933), pp. 527-545; Orthogonal trigonometric sums, the same Annals, vol. 34 (1933), pp. 799- 
814; A class of orthogonal functions on plane curves, the same Annals, vol. 40 (1939), pp. 521-532. 

(5) See e.g. D. Jackson, Series of orthogonal polynomials, loc. cit., pp. 534-535. 




the normalized odd Legendre polynomials and the polynomials $(x^2 - y_1^2)\pi_{n-2}(x)$,
n = 2, 4, ..., where $\pi_0, \pi_2, \dots$ are the even orthonormal polynomials for
$(x^2 - y_1^2)^2$ as weight function.

With conditions of the form last mentioned, the ordinary proof of
convergence, after the polynomials are known to be bounded, requires modification
in one particular, because of the fact that the orthogonal system does
not include a constant. Consider for definiteness the case of the single
condition $p_n(y_1) = -p_n(-y_1)$. If f(x) is a function developed in series of the p's,
the partial sum of the series is given by

\[ s_n(x) = \int_{-1}^{1} f(t) K_n(t, x)\,dt, \qquad K_n(t, x) = \sum_{k=0}^{n} p_k(t) p_k(x). \]

A polynomial of the nth or lower degree satisfying the auxiliary condition is
reproduced by this formula exactly. For example,

\[ x = \int_{-1}^{1} t K_n(t, x)\,dt. \]

If f(x) can be represented in the form $x\phi(x)$, where $\phi(x)$ is a function of
sufficient regularity, convergence can be treated by means of the formulas

\[ s_n(x) - f(x) = \int_{-1}^{1} t\phi(t) K_n(t, x)\,dt - \phi(x) \int_{-1}^{1} t K_n(t, x)\,dt
   = \int_{-1}^{1} t[\phi(t) - \phi(x)] K_n(t, x)\,dt. \]

The assumption that f(x) can be represented in the form $x\phi(x)$ is no essential
restriction, as far as convergence at other points than x = 0 is concerned, for
it can be seen as in other cases that convergence at a point depends only on
the behavior of the function in the neighborhood of the point.
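The absence of a constant from the system can be made concrete. The sketch below is an illustration under stated assumptions and not the author's computation: it takes $y_1 = 1$, works with orthogonal rather than normalized polynomials so that rational arithmetic suffices, uses hypothetical seed polynomials ($x^n$ for odd n, $x^n - 1$ for even n), and forms the partial sum as a sum of projections, which is equivalent to integrating against the kernel.

```python
from fractions import Fraction

def inner(p, q):
    """Integral of p(x) q(x) over (-1, 1) with weight 1."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:
                total += a * b * Fraction(2, i + j + 1)
    return total

def seed(n):
    """Degree-n polynomial with p(1) = -p(-1): x^n (n odd) or x^n - 1 (n even)."""
    p = [Fraction(0)] * (n + 1)
    p[n] = Fraction(1)
    if n % 2 == 0:
        p[0] -= 1
    return p

def build_system(max_degree):
    """Orthogonal polynomials of degrees 1..max_degree (degree 0 is exceptional)."""
    system = []
    for n in range(1, max_degree + 1):
        p = seed(n)
        for q in system:
            c = inner(p, q) / inner(q, q)
            p = [a - c * (q[k] if k < len(q) else 0) for k, a in enumerate(p)]
        system.append(p)
    return system

def partial_sum(f, system, max_degree):
    """Sum of the projections of f on the members of the system."""
    s = [Fraction(0)] * (max_degree + 1)
    for p in system:
        c = inner(f, p) / inner(p, p)
        for k, a in enumerate(p):
            s[k] += c * a
    return s

system = build_system(3)
print(partial_sum([0, 1], system, 3))   # f(x) = x
print(partial_sum([1], system, 3))      # f(x) = 1
```

The function x is reproduced exactly, whereas the constant 1 is replaced by $\tfrac{5}{4}(1 - x^2)$, which vanishes at ±1; this is why the text writes f(x) as $x\phi(x)$ instead of subtracting a constant.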

Occasion arises for a somewhat less simple treatment of the problem in
connection, for example, with the auxiliary condition $p_n'(1) = p_n'(-1)$. This
is satisfied by any odd polynomial; it is not satisfied by any polynomial of
the second degree, but it is satisfied by a constant, or by any polynomial
divisible by $(1 - x^2)^2$. The even polynomials of the orthogonal system, however,
do not consist merely of a constant and the polynomials $(1 - x^2)^2 q_k(x)$, where
the q's are orthogonal for weight $(1 - x^2)^4$; for example, $(1 - x^2)^2 q_0(x)$ is not
orthogonal to a constant.

Let $p_0(x), p_1(x), p_3(x), p_4(x), \dots$ be the orthonormal polynomials satisfying
the auxiliary condition, and let $\xi_0(x), \xi_1(x), \xi_2(x), \dots$ be the normalized
Legendre polynomials. The odd p's are the odd $\xi$'s. (It is readily seen, as in
other problems having analogous features of symmetry, that the p's of even
degree are even polynomials, and those of odd degree are odd.) For n even let


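\[ (3) \qquad p_n(x) = c_{n0}\xi_0(x) + c_{n2}\xi_2(x) + \cdots + c_{nn}\xi_n(x), \qquad
   c_{nk} = \int_{-1}^{1} p_n(x)\xi_k(x)\,dx. \]

The polynomial $\xi_k(x) - \tfrac{1}{2}x^2\xi_k'(1)$, with k even, has a vanishing derivative
for $x = \pm 1$, and satisfies the auxiliary condition. Hence $p_n(x)$ is orthogonal
to it when n > k:

\[ \int_{-1}^{1} p_n(x)\xi_k(x)\,dx = \xi_k'(1) \int_{-1}^{1} \tfrac{1}{2}x^2 p_n(x)\,dx, \]

i.e., if the last integral is denoted by $g_n$, $c_{nk} = g_n\xi_k'(1)$.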
If $P_k(x)$ is the non-normalized Legendre polynomial, so that
$\xi_k(x) = [(2k+1)/2]^{1/2} P_k(x)$,

\[ P_k'(1) = k(k+1)/2, \qquad \xi_k'(1) = \tfrac{1}{2}k(k+1)[(2k+1)/2]^{1/2}. \]
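The values at x = 1 on which the estimates rest can be confirmed in exact arithmetic. The sketch below is a check added for the reader and not part of the paper; it assumes the standard three-term recurrence $(n+1)P_{n+1} = (2n+1)xP_n - nP_{n-1}$, and the function names are hypothetical.

```python
from fractions import Fraction

def legendre(k):
    """Coefficient list (lowest degree first) of the Legendre polynomial P_k."""
    previous, current = [Fraction(1)], [Fraction(0), Fraction(1)]
    if k == 0:
        return previous
    for n in range(1, k):
        shifted = [Fraction(0)] + current          # x * P_n
        padded = previous + [Fraction(0)] * 2      # P_{n-1}, padded to the same length
        following = [((2 * n + 1) * a - n * b) / (n + 1)
                     for a, b in zip(shifted, padded)]
        previous, current = current, following
    return current

def value_at_one(p):
    return sum(p)

def derivative_at_one(p):
    return sum(j * c for j, c in enumerate(p))

def norm_squared(p):
    """Integral of p(x)^2 over (-1, 1)."""
    return sum(a * b * Fraction(2, i + j + 1)
               for i, a in enumerate(p) for j, b in enumerate(p)
               if (i + j) % 2 == 0)

for k in range(6):
    P = legendre(k)
    print(k, value_at_one(P), derivative_at_one(P), norm_squared(P))
```

Since the integral of $P_k^2$ is 2/(2k+1), the normalizing factor is $[(2k+1)/2]^{1/2}$, and multiplying $P_k'(1) = k(k+1)/2$ by it gives the stated value of $\xi_k'(1)$.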


Since $p_n(x)$ is normalized,

\[ 1 = \int_{-1}^{1} [p_n(x)]^2 dx = {\sum_{k=0}^{n}}' c_{nk}^2
     = c_{nn}^2 + g_n^2 {\sum_{k=0}^{n-2}}' [\xi_k'(1)]^2, \]

the sign $\sum'$ indicating summation over even values of k. The sum by which $g_n^2$
is multiplied is of the order of magnitude of $n^6$, from which it follows that
$g_n = O(1/n^3)$.

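Let

\[ a_k = [2/(2k+1)]^{1/2}\xi_k'(1) = k(k+1)/2, \]

\[ S_n(x) = {\sum_{k=0}^{n-2}}' \xi_k'(1)\xi_k(x)
          = {\sum_{k=0}^{n-2}}' a_k [(2k+1)/2]^{1/2}\xi_k(x)
          = {\sum_{k=0}^{n-2}}' a_k \xi_k(1)\xi_k(x). \]

Let

\[ \sigma_k(x) = {\sum_{i=0}^{k}}' \xi_i(1)\xi_i(x) \]

for even k. Then ($a_0$ being zero)

\[ S_n(x) = {\sum_{k=0}^{n-4}}' (a_k - a_{k+2})\sigma_k(x) + a_{n-2}\sigma_{n-2}(x). \]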

Now, with summation extended over both odd and even values of k,

\[ \sum_{i=0}^{k} \xi_i(1)\xi_i(x) = \frac{k+1}{[(2k+1)(2k+3)]^{1/2}} \cdot
   \frac{\xi_{k+1}(1)\xi_k(x) - \xi_k(1)\xi_{k+1}(x)}{1 - x}, \]

which, as $\xi_k(1) = O(k^{1/2})$, $\xi_k(x) = O(1)$, does not exceed a constant multiple
of $k^{1/2}$ on a closed interval interior to (-1, 1). A similar statement holds
if x is replaced by -x, and consequently holds for the even and odd parts
of the sum separately. In particular, $|\sigma_k(x)| = O(k^{1/2})$ uniformly in any closed
interval interior to (-1, 1). On the other hand, $a_{k+2} - a_k = O(k)$. So
$|S_n(x)| = O(n^{5/2})$.

The relation (3) may be written

\[ p_n(x) = c_{nn}\xi_n(x) + g_n S_n(x). \]

From the preceding paragraphs, $|g_n S_n(x)| = O(1/n^{1/2})$. By application of
Schwarz's inequality to the integral defining the coefficient, $|c_{nn}| \le 1$. The
polynomials $p_n(x)$ are uniformly bounded over any closed interval interior to
(-1, 1).

An essentially similar problem is that associated with the condition
$p_n'(1) = -p_n'(-1)$. The polynomials of even degree in the orthogonal system
are Legendre polynomials. The requisite information about the coefficients
in the representation of the odd polynomials of the system in terms of
Legendre polynomials comes from the fact that $\xi_k(x) - x\xi_k'(1)$ satisfies the
auxiliary condition when k is odd.

Considerations of the same sort are effective in connection with the
unsymmetric condition $p_n(1) = h p_n(-1)$, where h is an arbitrary constant $\ne \pm 1$.
Here the orthogonal polynomials are neither even nor odd(6). In the
representation

\[ p_n(x) = \sum_{k=0}^{n} c_{nk}\xi_k(x) \]

the coefficients $c_{nk}$ for k < n are determined in accordance with the fact that
$\xi_k(x) - \mu\xi_k(1)$ satisfies the auxiliary condition with $\mu = (1+h)/(1-h)$ when k
is odd, and $\xi_k(x) - \xi_k(1)$ satisfies it when k is even. Hence, for k < n,
$c_{nk} = g_n\xi_k(1)$ or $\mu g_n\xi_k(1)$ according as k is even or odd, with

\[ g_n = \int_{-1}^{1} p_n(x)\,dx. \]

Since

\[ \int_{-1}^{1} [p_n(x)]^2 dx = 1, \qquad \xi_k(1) = [(2k+1)/2]^{1/2}, \]

it follows that $g_n = O(1/n)$. And as was noted above, $|\sum \xi_k(1)\xi_k(x)| = O(n^{1/2})$
in the interior of (-1, 1), whether the summation is extended over all subscripts
from 0 to n, or over the even subscripts or the odd subscripts of the
set separately. Hence the desired conclusion with regard to the boundedness
of the p's.

The condition $p_n(1) = 0$ leads merely to the set of polynomials $(x - 1)\pi_{n-1}(x)$,


(6) For an explicit determination of these polynomials see Gröbner, loc. cit., pp. 46-47.




where the $\pi_k(x)$ are orthonormal for weight $(x - 1)^2$. The condition
$p_n'(1) = 0$ appears to be less trivial; boundedness of the p's can be
proved by use of the observation that $\xi_k(x) - x\xi_k'(1)$ has a vanishing derivative
for x = 1.

A primitive example of higher order is the condition $p_n''(1) = 0$. It is
satisfied by $\xi_k(x) - \tfrac{1}{2}x^2\xi_k''(1)$. So $p_n(x)$ is orthogonal to this expression for k < n,
and if $p_n(x) = \sum_k c_{nk}\xi_k(x)$, then for k < n

\[ c_{nk} = g_n\xi_k''(1), \qquad g_n = \int_{-1}^{1} \tfrac{1}{2}x^2 p_n(x)\,dx. \]

Since $\xi_k''(1) = \tfrac{1}{8}(k-1)k(k+1)(k+2)[(2k+1)/2]^{1/2}$, it follows by reasoning similar to
that which has already been used that $g_n = O(1/n^5)$, $|\sum \xi_k''(1)\xi_k(x)| = O(n^{9/2})$
in the interior of the interval, and the p's are bounded as in other cases.

Consider next the pair of conditions $p_n(1) = p_n(-1)$, $p_n'(1) = p_n'(-1)$.
Because of the symmetry of the problem, the orthogonal polynomials are even
or odd, and the even and odd sequences can be considered separately. When n
is even, the conditions are satisfied by $\xi_n(x) - \tfrac{1}{2}x^2\xi_n'(1)$; when n is odd they
are satisfied by $\xi_n(x) - x\xi_n(1)$. In each case it follows on the basis of
calculations which have been presented already that the p's are bounded except near
the ends of the interval.

As a final illustration, of somewhat more general character, suppose there
is a single auxiliary condition $U_1(p_n) = 0$ expressed in terms of the values of $p_n$
and an arbitrary finite number of its derivatives at the points ±1. Let $x^\delta$ be
a power of x, for simplicity the lowest power, such that $U_1(x^\delta) \ne 0$. Let

\[ \psi_k(x) = \xi_k(x) - A_k x^\delta. \]

The coefficient $A_k$ can be determined so that

\[ U_1(\psi_k) = U_1[\xi_k(x)] - A_k U_1(x^\delta) = 0. \]

When $A_k$ is thus determined, $p_n(x)$ is orthogonal to $\psi_k(x)$ if n > k, n > $\delta$:

\[ \int_{-1}^{1} p_n(x)\xi_k(x)\,dx = A_k g_n, \qquad g_n = \int_{-1}^{1} x^\delta p_n(x)\,dx. \]

Since $\xi_k^{(j)}(-1) = (-1)^{k+j}\xi_k^{(j)}(1)$, there is a set of coefficients $a_0, a_1, \dots, a_\alpha$,
independent of k, with $a_\alpha \ne 0$, such that

\[ A_k = a_0\xi_k(1) + a_1\xi_k'(1) + \cdots + a_\alpha\xi_k^{(\alpha)}(1) \]

when k is even, unless $A_k = 0$ for all even values of k, and a set of coefficients
$b_0, b_1, \dots, b_\beta$, independent of k, with $b_\beta \ne 0$, such that

\[ A_k = b_0\xi_k(1) + b_1\xi_k'(1) + \cdots + b_\beta\xi_k^{(\beta)}(1) \]

when k is odd, unless $A_k = 0$ for all odd values of k; here $\alpha$ and $\beta$ are in general
equal to the order $\gamma$ of the highest derivative occurring in $U_1(p_n)$, but one of
them may in particular have a smaller value. At least one of the numbers $\alpha$, $\beta$
is certainly equal to $\gamma$, and at least one coefficient $a_\gamma$ or $b_\gamma$ is present with a
value different from zero.

The quantity $\xi_k^{(\gamma)}(1)$ is of the order of magnitude of $k^{2\gamma+1/2}$, and $\sum_{k=0}^{n-1} A_k^2$
is not less than a positive constant multiple of $n^{4\gamma+2}$ when n is sufficiently
large. From the fact that $\sum_k c_{nk}^2 = 1$ if $p_n(x) = \sum_k c_{nk}\xi_k(x)$, it follows that
$g_n = O(1/n^{2\gamma+1})$. Since

\[ [\xi_{k+1}^{(\gamma)}(1)/\xi_{k+1}(1)] - [\xi_k^{(\gamma)}(1)/\xi_k(1)] = O(k^{2\gamma-1}), \]

it may be shown by the use of partial summation in conjunction with the
Christoffel identity, in the manner previously indicated, together with
inequalities obtained in the same way or, more simply, without resort to partial
summation for the derivatives of lower order, that

\[ \Bigl| \sum_{k=0}^{n-1} A_k\xi_k(x) \Bigr| = O(n^{2\gamma+1/2}) \]

in the interior of (-1, 1). The p's are bounded as before.

It is apparent that the methods that have been used are capable of further 
extension. It is not so clear what the most general explicit formulation would 
be. On the other hand, it may be that some different method would lead to 
more general results at a single stroke. 


THE UNIVERSITY OF MINNESOTA, 
MINNEAPOLIS, MINN.




CONTINUOUS ADDITIVE FUNCTIONALS ON THE 
SPACE (BV) AND CERTAIN SUBSPACES 


BY 
C. RAYMOND ADAMS AND ANTHONY P. MORSE 


1. Introduction. We consider here the class (BV) of functions x(t) of
bounded variation on the interval

\[ I = E_t[0 \le t \le 1]. \]

Its intersection with the class (C) of functions continuous on I will be
designated by (CBV), its subclass of absolutely continuous functions by (AC), and
(BV)-(CBV) by (DBV). In a recent paper(1) Adams introduced for (BV)
the metric

\[ (1) \qquad (x, y) = \int_0^1 |x(t) - y(t)|\,dt + |T_0^1(x) - T_0^1(y)|, \]

$T_0^1(z)$ being employed in general to denote the total variation of the function
z(t) on I. Thus metrised, (BV) is not a Banach space(2); but it is complete,
separable, and boundedly compact. Although a linear space, it is not a "linear
topological space" in the sense in which that term is sometimes used, for the
topology introduced by the metric (1) is non-uniform. Indeed it is easily seen
that the category of a subset is not always invariant under translation. For,
if the closed unit sphere $K(\theta, 1)$ about the zero-element $\theta$ as center, which is a
set of second category in (BV), is subjected to the translation x where x has as
a representative function x(t) = 0 for 0 < t < 1, x(0) = 2, then its translate

\[ E = E_y[y = x + z,\ z \in K] \]

is a subset of (DBV) and so of first category(3) in (BV). Regarded as a group,
(BV) is discontinuous.
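That distances under the metric (1) are not preserved by translation can be seen on a very small family of functions. The sketch below is a toy model introduced here, not taken from the paper: a function is assumed to take the value x0 at t = 0 and a constant value c for 0 < t ≤ 1, so that its total variation is |c - x0| and the integral term in (1) reduces to a difference of the constants.

```python
# Toy family (an assumption of this sketch): x(0) = x0 and x(t) = c for 0 < t <= 1,
# stored as the pair (x0, c).

def total_variation(x):
    """Total variation on [0, 1] of a function of the toy family."""
    x0, c = x
    return abs(c - x0)

def distance(x, y):
    """The metric (1): integral of |x - y| plus the difference of total variations."""
    integral_term = abs(x[1] - y[1])      # the single point t = 0 does not contribute
    return integral_term + abs(total_variation(x) - total_variation(y))

def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

theta = (0, 0)        # the zero element
x = (2, 0)            # the translation used in the text: x(0) = 2, zero elsewhere
z = (-3, 0)           # another function with a single external saltus at t = 0

print(distance(theta, z))                   # distance before translation
print(distance(add(x, theta), add(x, z)))   # distance after translating both by x
```

The two printed distances are 3 and 1, so translation by x is not an isometry for the metric (1).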


Presented to the Society, December 2, 1939; received by the editors January 25, 1940.

(1) Adams, The space of functions of bounded variation and certain general spaces, these 
Transactions, vol. 40 (1936), pp. 421-438, hereinafter referred to as A. The properties of (BV) 
mentioned presently are either explicitly established in, or easily to be inferred from the results 
of, this paper and its sequel by Adams and Morse, On the space (BV), ibid., vol. 42 (1937), pp. 
194-205, later referred to as AM. 

(2) Indeed it is clear, from the fact that "convergence in variation" is not additive, that it
is impossible to norm the set (BV), or either of its subsets (CBV) and (AC), in such manner 
that convergence in the metric determined by the norm is equivalent to convergence in the 
metric (1). See Adams and Clarkson, On convergence in variation, Bulletin of the American 
Mathematical Society, vol. 40 (1934), pp. 413-417. This same remark holds for (BV) or (CBV) 
metrised with the distance function (17); see §6 below. 

(3) See AM, p. 199. An example of a residual set which under the same translation goes into
a set of first category is provided by (CBV). 




2. Functionals on (BV). The functional(4) $f(x) = \mathrm{ess\,lim}_{t \to 0}\, x(t) = \lim_{t \to 0} x(t)$
clearly is defined for every $x \in (BV)$ in the natural sense that if x(t) is an
element of the class (BV), f(x) is a real number, and if x(t) and y(t) are both
elements of the class (BV), (x, y) = 0 implies f(x) = f(y). This functional is
additive and homogeneous on (BV), and it may readily be seen to be
continuous at each point of the subset (BVN) corresponding to functions having
no external saltus anywhere(5); nevertheless it is discontinuous at each point
of (BV)-(BVN). Incidentally, both (BVN) and its complement are dense
in (BV). In further contrast to the situation in the case of a Banach space,
there exist functionals which are additive and continuous on (BV) without
being uniformly continuous. But any functional f(x) which is additive
and uniformly continuous on (BV) does satisfy a Lipschitz condition,
$|f(x) - f(y)| \le M \cdot (x, y)$ for $x, y \in (BV)$; and the smallest number M which
can be used in this inequality we have called the "modulus" of f on (BV)
and designated by the symbol $\mathrm{mod}_{(BV)} f$.

In A, Theorems 5.1, 5.2, and 5.3, it was shown that every functional f
additive and uniformly continuous on (BV) [or on (CBV) or on (AC)] can
be expressed in the form of a Lebesgue integral,

\[ (2) \qquad f(x) = \int_0^1 x(t)a(t)\,dt, \qquad \mathrm{ess\,sup}_{t \in I} |a(t)| = M < \infty, \]

with $\mathrm{mod}_{(BV)} f = M$ [or $\mathrm{mod}_{(CBV)} f = M$ or $\mathrm{mod}_{(AC)} f = M$]; and that each
integral of this kind is such a functional(6). An example of an additive and
continuous functional which is not uniformly continuous on (BV) is provided
by any such integral with a(t) summable but not essentially bounded. The
general form of the additive and continuous, but not necessarily uniformly
continuous, functional on (BV), however, was not determined in A. This open
question we now propose to settle.


THEOREM 1. The conditions $T_0^1(x_n) < B < \infty$ (n = 0, 1, 2, ...),
$\lim_{n \to \infty} \int_0^1 |x_n - x_0|\,dt = 0$, and $g \in (C)$ imply
$\lim_{n \to \infty} \int_0^1 x_n\,dg = \int_0^1 x_0\,dg$.


Proof. This theorem is equivalent to Bray's extension(7) of a theorem of


(4) By functional we mean an operation or transformation whose range is contained in the
real number system. We recall the well known fact that in a Banach space continuity of an
additive functional f at one point alone implies continuity everywhere, uniform continuity,
and the satisfaction by f of a Lipschitz condition on the entire space.

(5) For a precise definition of (BVN) see the first paragraph of §3 below.

(6) More recently Hildebrandt, in Linear operations on functions of bounded variation,
Bulletin of the American Mathematical Society, vol. 44 (1938), p. 75, has determined the general
form of the continuous additive functional on the non-separable Banach space which the class
(BV) becomes when normed with $\|x\| = |x(0)| + T_0^1(x)$. As in the case of other non-separable
Banach spaces previously considered by this author, the functional is expressed by a
generalized integral of Stieltjes or Lebesgue type which he constructs for the purpose.

(7) See Bray, Elementary properties of the Stieltjes integral, Annals of Mathematics, (2),
vol. 20 (1918-1919), p. 180.



84 C. R. ADAMS AND A. P. MORSE [July 


Helly, in the sense that each can be derived from the other. It seems preferable
to us, however, to prove our result de novo rather than to use Bray’s
theorem as a basis. That the first two conditions imply uniform boundedness
of $x_n$, and that all three imply $\lim_{n\to\infty} \int_0^1 x_n\,dg = \int_0^1 x_0\,dg$ in the particular case in
which $g \in (AC)$, has already been remarked in the first paragraph of the proof
of Theorem 5.1 of A. We now extend the proof to the general case, in which
$g$ is an arbitrary continuous function. Let $\epsilon$ be any positive number; let $B_1 \ge B$
be a bound for $|x_n(t) - x_0(t)|$ $(t \in I;\ n = 1, 2, 3, \cdots)$; and let $h(t)$ satisfy the
conditions

$$h \in (AC), \qquad \sup_{t \in I} |g(t) - h(t)| \le \epsilon/(2B_1), \qquad g(0) - h(0) = g(1) - h(1) = 0.$$

In accordance with the particular case of the theorem already proved we have
$\lim_{n\to\infty} \int_0^1 (x_n - x_0)\,dh = 0$, whence as $n \to \infty$

$$\begin{aligned}
\limsup \left| \int_0^1 (x_n - x_0)\,dg \right|
&\le \limsup \left| \int_0^1 (x_n - x_0)\,d(g - h) \right| + \limsup \left| \int_0^1 (x_n - x_0)\,dh \right| \\
&= \limsup \left| \int_0^1 (g - h)\,d(x_n - x_0) \right| \\
&\le \limsup \frac{\epsilon}{2B_1}\, T_0^1(x_n - x_0) \\
&\le \limsup \frac{\epsilon}{2B_1}\, [T_0^1(x_n) + T_0^1(x_0)] \le \epsilon.
\end{aligned}$$

Each integral

$$\int_0^1 x(t)\,dg(t) \quad \text{with } g \in (C) \tag{3}$$

is a continuous additive functional for $x \in (BV)$.
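
The convergence asserted in Theorem 1 can be checked numerically in a simple case. The sketch below is only an illustration under stated assumptions: the integrator g(t) = t^2, the step functions x_n, and the helper names `stieltjes_sum` and `make_step` are all hypothetical choices, not taken from the paper. Each x_n has total variation 1 and its L1 distance from x_0 is 1/n, so the hypotheses of the theorem hold.

```python
def stieltjes_sum(x, g, m=20000):
    """Midpoint Riemann-Stieltjes sum of x dg over [0, 1] on m equal cells."""
    total = 0.0
    for i in range(m):
        a, b = i / m, (i + 1) / m
        total += x(0.5 * (a + b)) * (g(b) - g(a))
    return total


def g(t):
    # Assumed integrator; any continuous g could be used instead.
    return t * t


def make_step(c):
    # x(t) = 1 for t < c, 0 otherwise; total variation 1 when 0 < c < 1.
    return lambda t: 1.0 if t < c else 0.0


x0 = make_step(0.5)
limit = stieltjes_sum(x0, g)
for n in (4, 16, 64, 256):
    xn = make_step(0.5 + 1.0 / n)  # L1 distance from x0 is 1/n
    print(n, stieltjes_sum(xn, g) - limit)
```

For these step functions the exact value of the integral is g(c) - g(0) = c^2, so the printed differences shrink toward zero as n grows.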


THEOREM 2. Each continuous additive functional on (BV) can be expressed 
in the form (3). 


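Proof. Let $f$ be any such functional and, as in the proof of Theorem 5.2
of A, set

$$\xi_t(u) = \begin{cases} 1 & \text{for } 0 \le u \le t, \\ 0 & \text{for } t < u \le 1, \end{cases} \qquad f(\xi_t) = g(t), \quad t \in I.$$

For any pair of numbers $t$, $t_1$ in the interval $0 \le t < 1$ we have
$(\xi_t, \xi_{t_1}) = \int_0^1 |\xi_t - \xi_{t_1}|\,du$, so that $t \to t_1$ implies $(\xi_t, \xi_{t_1}) \to 0$. This in turn implies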


1940] FUNCTIONALS ON THE SPACE (BV) 85 


$f(\xi_t) \to f(\xi_{t_1})$, since $f$ is continuous at $\xi_{t_1}$; i.e., implies $g(t) \to g(t_1)$. If we let
$\eta(u) = 1$ for $0 \le u < 1$, $\eta(1) = 0$, it is clear that $t \to 1$ implies
$(\xi_t, \eta) = \int_0^1 |\xi_t - \eta|\,du \to 0$, which implies $f(\xi_t) \to f(\eta)$; i.e., we may infer that
$\lim_{t\to 1} g(t)$ exists. Hence, defining $\bar g(t) = g(t)$ for $0 \le t < 1$, $\bar g(1) = \lim_{t\to 1} g(t)$,
we have $\bar g \in (C)$ and $\int_0^1 x\,d\bar g$ exists for every $x \in (BV)$. The argument used in the
proof of Theorem 5.2 of A now holds, with no change whatever, from the beginning of the
second paragraph up to and including equation (5.4); that is to say, it may be concluded
that $g(1) = \bar g(1)$ and $f(x) = \int_0^1 x(t)\,dg(t)$ for every $x \in (BV)$.

3. Functionals on (CBV) and (AC). An arbitrary function $x$ will be said
to have no external saltus if and only if at each point $t_1 \in I$, $x$ satisfies the condition
$\liminf_{t\to t_1} x(t) \le x(t_1) \le \limsup_{t\to t_1} x(t)$. We shall employ (BVN) to designate
the intersection of the class (BV) with the class of functions having no
external saltus. Clearly $x \in (BVN)$ implies continuity of $x$ at $t = 0$ and $t = 1$.


THEOREM 3. The conditions(8) $x_n \in (BV)$ $(n = 1, 2, 3, \cdots)$, $x_0 \in (BVN)$,
$(x_n, x_0) \to 0$, $|g(t)| \le B < \infty$ for $t \in I$, and $\int_0^1 g\,dx_n$ exists(9) $(n = 0, 1, 2, \cdots)$
imply $\lim_{n\to\infty} \int_0^1 g\,dx_n = \int_0^1 g\,dx_0$ and $\lim_{n\to\infty} \int_0^1 x_n\,dg = \int_0^1 x_0\,dg$.


Proof. In the same manner in which it may be seen that a closed set
$E \subset I$ can be inclosed in a finite set of disjoint intervals $O_i$ each open with
respect to $I$ and the sum of whose lengths exceeds the measure of $E$ by arbitrarily
little, one may see that $E$ can be inclosed in a finite set of such intervals
with $\sum_i T_{\bar O_i}(x_0)$ exceeding the variation(9) of $x_0$ on $E$ by an arbitrarily
small amount.

Let $\epsilon$ be an arbitrary positive number, $k$ a positive number satisfying the
inequality $k\,T_0^1(x_0) < \epsilon$, and $D_k \subset I$ the set of points where $g$ has a saltus $\ge k$.
Since $D_k$ is closed and the variation of $x_0$ on $D_k$ is zero, $D_k$ can be inclosed in
a finite set of disjoint intervals $O_i$, each open with respect to $I$, such that
$\sum_i T_{\bar O_i}(x_0) < \epsilon/(2B)$. Since the points of continuity of $x_0$ are dense in $I$ and
include 0 and 1, and since $D_k$ is closed, each interval $O_i$ can be shrunk (if
necessary) into an interval $O_i'$, open with respect to $I$, whose end-points are
points of continuity of $x_0$ and such that $D_k$ is still inclosed in $\sum_i O_i'$. By


(8) From the proof it will be clear that a weaker set of conditions sufficient to insure the
conclusion is obtained by replacing $(x_n, x_0) \to 0$ by the following: $\int_0^1 |x_n - x_0|\,dt \to 0$ and the
existence of a set $D$ dense in $I$ and containing 0 and 1 and of a non-decreasing function $F$ such that
$\int_0^1 g\,dF$ exists and on every closed interval $J \subset I$ with end-points in $D$, $\limsup_{n\to\infty} T_J(x_n)$ does not
exceed the increment of $F$ on $J$. We take occasion to remark that this theorem neither includes
nor is included by a theorem of Daniell on passage to the limit, Further properties of the general
integral, Annals of Mathematics, (2), vol. 21 (1919-1920), p. 218.

(9) It is desirable to recall here the meaning of the term “variation of a function on a set.”
Let $x \in (BV)$ and $E \subset I$; then the variation of $x$ on $E$ is by definition the infimum of numbers
of the form $\sum_i T_{\bar O_i}(x)$ where $E \subset \sum_i O_i$, each $O_i$ is an interval open with respect to $I$ (i.e., $O_i$ is
the intersection with $I$ of some open interval), and $\bar O_i$ is the closure of $O_i$. If $g$ is bounded on $I$
and $x \in (BV)$, a necessary and sufficient condition for $\int_0^1 g\,dx$ to exist is that the variation of $x$ on
the set of points of discontinuity of $g$ be zero. See, for example, Hobson, Theory of Functions
of a Real Variable, 3d edition, vol. 1, Cambridge, 1927, p. 542.




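Theorem 1 of AM, $(x_n, x_0) \to 0$ implies the same condition on each subinterval
$\bar O_i'$. Letting $\alpha = \sum_i \bar O_i'$ and $T_\alpha(x_n) = \sum_i T_{\bar O_i'}(x_n)$ $(n = 0, 1, 2, \cdots)$, we therefore
have as $n \to \infty$

$$\begin{aligned}
\limsup \left| \int_\alpha g\,dx_n - \int_\alpha g\,dx_0 \right|
&\le \limsup \left| \int_\alpha g\,dx_n \right| + \limsup \left| \int_\alpha g\,dx_0 \right| \\
&\le \limsup B \cdot [T_\alpha(x_n) + T_\alpha(x_0)] = 2B\,T_\alpha(x_0) < \epsilon.
\end{aligned}$$

Let $\beta$ denote the finite set of disjoint closed intervals constituting the closure
of the point set $I - \alpha$, at no point of which $g$ has a saltus $\ge k$. By aid of the
Heine-Borel theorem(10) one may easily see that on each interval of $\beta$, $g$ can
be approximated uniformly within $k/2$ by a continuous function $h$. We then
have as $n \to \infty$

$$\begin{aligned}
\limsup \left| \int_\beta g\,dx_n - \int_\beta g\,dx_0 \right|
&= \limsup \left| \int_\beta (g - h)\,dx_n - \int_\beta (g - h)\,dx_0 + \int_\beta h\,dx_n - \int_\beta h\,dx_0 \right| \\
&\le \limsup \left| \int_\beta (g - h)\,dx_n - \int_\beta (g - h)\,dx_0 \right| + \limsup \left| \int_\beta h\,dx_n - \int_\beta h\,dx_0 \right| \\
&\le \limsup\, (k/2)[T_\beta(x_n) + T_\beta(x_0)] = k\,T_\beta(x_0) < \epsilon,
\end{aligned}$$

since $\lim_{n\to\infty} \int_\beta h\,dx_n = \int_\beta h\,dx_0$ is an immediate consequence of Theorem 1, the
formula for integration by parts, and the fact that by Theorem 2 of AM we
have(11) pointwise convergence of $x_n(t)$ to $x_0(t)$ at each end-point of the intervals
constituting $\beta$. That $\int_0^1 x_n\,dg \to \int_0^1 x_0\,dg$ now follows at once by aid of the
formula for integration by parts and the pointwise convergence of $x_n(t)$ to
$x_0(t)$ at $t = 0$ and $t = 1$.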


(10) An explicit construction for $h$, on an interval which may as well be taken as $I$, is
the following. Let $M(t)$, $N(t)$ be the “maximum and minimum functions” for $g$ (i.e., for example,
let $M(t) = \lim_{\delta\to 0} \sup_{t_1 \in I,\ |t_1 - t| < \delta} g(t_1)$ for $t \in I$); the saltus of $g$ at $t$ is then $M(t) - N(t)$
and this is $< k$ for $t \in I$. Setting $M_n(t) = \sup_{t_1 \in I} [g(t_1) - n|t_1 - t|]$, $N_n(t) = \inf_{t_1 \in I} [g(t_1) + n|t_1 - t|]$
$(t \in I;\ n = 1, 2, 3, \cdots)$, we easily see that for each $n$ these functions satisfy a Lipschitz condition,
and that $M_n(t) \to M(t)$ from above and $N_n(t) \to N(t)$ from below as $n \to \infty$, so that $M_n(t) - N_n(t)$
tends to $M(t) - N(t)$ from above. Let

$$E_n = E_t[M_n(t) - N_n(t) \ge k];$$

then each $E_n$ is closed, each $E_n \supset E_{n+1}$, and $\prod E_n$ is vacuous. Hence there exists an integer $n_0$
for which $E_{n_0}$ is vacuous; i.e., we have $M_{n_0}(t) - N_{n_0}(t) < k$ for $t \in I$. The desired function $h$ may
now be taken to be $[M_{n_0}(t) + N_{n_0}(t)]/2$ for $t \in I$.
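
The envelope construction in this footnote can be tried on a grid. The following is a minimal sketch, assuming a hypothetical test function g with one jump and restricting the sup and inf to grid points; the names `envelopes`, `ts`, and `gs` are illustrative only. On a grid the envelopes are discrete analogues of M_n and N_n, so the output shows the mechanism rather than the limiting saltus.

```python
import math


def envelopes(ts, gs, n):
    """Discrete analogues of M_n and N_n on the grid ts, with gs = g(ts)."""
    upper = [max(gj - n * abs(tj - t) for tj, gj in zip(ts, gs)) for t in ts]
    lower = [min(gj + n * abs(tj - t) for tj, gj in zip(ts, gs)) for t in ts]
    return upper, lower


def g(t):
    # Assumed test function: smooth part plus a jump of size 0.3 at t = 0.5.
    return math.sin(8.0 * t) + (0.3 if t >= 0.5 else 0.0)


m = 200
ts = [i / m for i in range(m + 1)]
gs = [g(t) for t in ts]

for n in (1, 4, 16, 64):
    upper, lower = envelopes(ts, gs, n)
    h = [(u + l) / 2 for u, l in zip(upper, lower)]  # candidate approximant
    gap = max(u - l for u, l in zip(upper, lower))
    err = max(abs(gv - hv) for gv, hv in zip(gs, h))
    print(n, round(gap, 4), round(err, 4))
```

Since g lies between the two envelopes and h is their midpoint, the error of h never exceeds half the gap, which is the reason the footnote stops at the first index for which the gap falls below k.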

(11) According to this theorem the conditions $x_n \in (BV)$ $(n = 1, 2, 3, \cdots)$, $x_0 \in (BVN)$, and
$(x_n, x_0) \to 0$ imply pointwise convergence of $x_n$ to $x_0$ at each point of continuity of $x_0$.




For convenience in stating the following corollaries we let (R) stand for
the class of functions which are Riemann integrable on $I$ and (R*) for the subclass
of (R) of which each function has only a countable number of discontinuities.
Recalling(12) that each pair of conditions, $x \in (CBV)$, $g \in (R^*)$ and
$x \in (AC)$, $g \in (R)$, is sufficient to insure the existence of $\int_0^1 x\,dg$, we have


COROLLARY 1. Each integral

$$\int_0^1 x(t)\,dg(t) \quad \text{with } g \in (R^*) \tag{4}$$

is a continuous additive functional for $x \in (CBV)$.


COROLLARY 2. Each integral

$$\int_0^1 x(t)\,dg(t) \quad \text{with } g \in (R) \tag{5}$$

is a continuous additive functional for $x \in (AC)$.


In conjunction with Theorem 2, Corollary 1 shows that there exist continuous
additive functionals on (CBV) which cannot be extended to be continuous
and additive on (BV). In determining the general form of such a
functional on (CBV) it is therefore desirable, if not actually necessary, to
work wholly within the space (CBV) itself. We propose to prove that the
general form of such a functional on (CBV) is (4) and on (AC) is (5). For
this purpose we shall employ two lemmas as follows. Allowing $(AC)_0$ to denote
the subset of (AC) of which each function vanishes at $t = 0$, we have


LEMMA 1. Let $f$ be a functional whose domain includes $(AC)_0$. If $f$ is additive
on $(AC)_0$ and continuous on $(AC)_0$, metrised with the distance function
$(x, y) = T_0^1(x - y)$, there exists a function $h$ bounded and summable on $I$ such that

$$f(x) = \int_0^1 h(t)\,x'(t)\,dt \quad \text{for } x \in (AC)_0. \tag{6}$$

This result is an immediate consequence of the facts that $(AC)_0$ is isometric
with the Banach space (L) of functions summable on $I$ and that for
$x' \in (L)$ the general form of the continuous additive functional $f(x')$ on
(L) is given(13) by the integral in (6). Since convergence in the metric
$(x, y) = T_0^1(x - y)$ implies convergence in the metric (1), any functional $f$
whose domain includes $(AC)_0$ and which is additive and continuous on $(AC)_0$
metrised with (1) can be expressed, for $x \in (AC)_0$, in the form (6).

To promote our later convenience we formulate the 


(12) See, for example, Hobson, loc. cit., p. 545.
(13) See, for example, Banach, Théorie des Opérations Linéaires, Warsaw, 1932, p. 65.




DEFINITION. The function $h$ in (6) shall be called the normalized function
associated with the functional $f$ if and only if it has the properties

$$\begin{aligned}
&h(0) = f(x_1) \ \text{where } x_1(t) = 1 \text{ for } t \in I, \qquad h(1) = 0, \\
&h(t) = \tfrac{1}{2}\Big[ \limsup_{\delta\to 0} \int_t^{t+\delta} h(u)\,du/\delta + \liminf_{\delta\to 0} \int_t^{t+\delta} h(u)\,du/\delta \Big] \quad \text{for } 0 < t < 1.
\end{aligned} \tag{7}$$

It should be clear from Lemma 1 and from this definition that associated
with each functional $f$ of the kind specified there is a normalized function $h$;
and that if $\epsilon$ is an arbitrary positive number and $t_1$ an arbitrary point in the
open interval $0 < t < 1$, there exists in every neighborhood of $t_1$ a set of positive
measure on which $h(t)$ is $< h(t_1) + \epsilon$ and a set of positive measure on which
$h(t)$ is $> h(t_1) - \epsilon$.

It follows at once that any functional $f$ whose domain includes (AC), and
which is additive and continuous on (AC) metrised with the distance function
$(x, y) = |x(0) - y(0)| + T_0^1(x - y)$, can be expressed in the form(14)

$$f(x) = x(0) f(x_1) + \int_0^1 h(t)\,x'(t)\,dt = x(0)h(0) + \int_0^1 h(t)\,x'(t)\,dt \quad \text{for } x \in (AC), \tag{8}$$

where $h$ has the properties (7).


LEMMA 2. Let $\delta$ be an arbitrary number $> 0$; $P$ a non-vacuous perfect set
$\subset I$; $x(t)$ a continuous non-decreasing function(15) with $x(0) = 0$, $x(1) = 1$, and
$x'(t) = 0$ for $t \in I - P$; and $h(t)$ a bounded measurable function for $t \in I$, with essential
saltus $\ge k > 0$ at each point of $P$. Then there exist two non-decreasing
functions $\lambda(t)$, $\mu(t) \in (AC)_0$ satisfying the conditions

$$\sup_{t \in I} |\lambda(t) - x(t)| < \delta, \qquad \sup_{t \in I} |\mu(t) - x(t)| < \delta, \qquad \int_0^1 h(t)\lambda'(t)\,dt - \int_0^1 h(t)\mu'(t)\,dt \ge k/2. \tag{9}$$


(14) The usual scheme for determining the general form of the continuous additive functional
$f$ on the Banach space (C) employs the device of extending $f$ to points outside of (C);
i.e., to the class of step-functions or to the entire space (M) of essentially bounded measurable
functions. It may therefore be of interest to note that we now have in hand a means of obtaining
the form of $f$ without going outside of (C) itself. In fact from (8), by following essentially the
same procedure as is used below in the proof of Lemma 3 (only that in the present instance $x_1$
should be taken as $x_1(t) = 0$ for $t \in I$, $\epsilon$ should be taken as 1, and we have no concern with the
values of $\int_0^1 |x_{\delta,\epsilon}(t)|\,dt$ and $L_0^1(x_{\delta,\epsilon})$), one may readily show $\sup_{x \in (AC)} |f(x)| / \max_{t \in I} |x(t)| = T_0^1(h)$. Letting
$g(t) = h(0) - h(t)$ for $t \in I$, and observing that (AC) is dense in (C), we may conclude $f(x) = \int_0^1 x\,dg$
for $x \in (C)$ and $\|f\| = T_0^1(h) = T_0^1(g)$.

(15) One readily sees that, for any non-vacuous $P$, there exists a function $x$ on $I$ with these
properties; for example, in any subinterval of $I$ where $P$ is dense, $x(t)$ may be taken as $t$, and
elsewhere it may be defined by essentially the same process as is used for defining the Cantor
ternary function.




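Proof. Let $N$ be a positive integer large enough so that

$$|x(t_1) - x(t_2)| < \delta \quad \text{for } |t_1 - t_2| \le 1/N, \tag{10}$$

and let $I_n$ $(n = 1, 2, \cdots, N)$ stand for the interval $(n-1)/N < t < n/N$. For
each $n$ we define functions $\lambda_n$, $\mu_n$ as follows. If $P \cdot I_n$ is vacuous, set $\lambda_n(t) = \mu_n(t) = 0$
for $t \in I$. If $P \cdot I_n$ is non-vacuous, set

$$S_n = \operatorname{ess\,sup}_{t \in I_n} h(t), \qquad s_n = \operatorname{ess\,inf}_{t \in I_n} h(t),$$

so that $S_n - s_n \ge k$; let $\alpha_n$, $\beta_n$ be any measurable subsets of $I_n$ with

$$|\alpha_n| > 0, \quad |S_n - h(t)| < k/4 \ \text{for } t \in \alpha_n; \qquad |\beta_n| > 0, \quad |s_n - h(t)| < k/4 \ \text{for } t \in \beta_n;$$

and set

$$\lambda_n(t) = \begin{cases} \Delta_{I_n} x / |\alpha_n| & \text{for } t \in \alpha_n, \\ 0 & \text{for } t \in I - \alpha_n, \end{cases} \qquad
\mu_n(t) = \begin{cases} \Delta_{I_n} x / |\beta_n| & \text{for } t \in \beta_n, \\ 0 & \text{for } t \in I - \beta_n, \end{cases}$$

where the notation $|E|$ stands for the measure of a set $E$ and $\Delta_{I_n} x$ represents
the increment of the function $x$ on the closure of the interval $I_n$. We observe

$$\begin{aligned}
\int_0^1 h(t)\lambda_n(t)\,dt &= \int_{\alpha_n} h(t)\lambda_n(t)\,dt \ge \int_{\alpha_n} (S_n - k/4)\lambda_n(t)\,dt = (S_n - k/4)\,\Delta_{I_n} x, \\
\int_0^1 h(t)\mu_n(t)\,dt &\le (s_n + k/4)\,\Delta_{I_n} x,
\end{aligned} \tag{11}$$

and we assert that the non-decreasing functions

$$\lambda(t) = \int_0^t \sum_{n=1}^N \lambda_n(u)\,du, \qquad \mu(t) = \int_0^t \sum_{n=1}^N \mu_n(u)\,du$$

have the properties (9). It is clear that $\lambda(t) = \mu(t) = x(t)$ for $t = n/N$
$(n = 0, 1, \cdots, N)$, whence in view of (10) the first two of inequalities (9)
are satisfied. As for the third, we have by (11)

$$\begin{aligned}
\int_0^1 h(t)\lambda'(t)\,dt - \int_0^1 h(t)\mu'(t)\,dt
&= \sum_{n=1}^N \int_0^1 h(t)\lambda_n(t)\,dt - \sum_{n=1}^N \int_0^1 h(t)\mu_n(t)\,dt \\
&\ge \sum_{n=1}^N (S_n - k/4 - s_n - k/4)\,\Delta_{I_n} x \ge (k/2) \sum_{n=1}^N \Delta_{I_n} x = k/2.
\end{aligned}$$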




THEOREM 4. Each continuous additive functional on (CBV) can be expressed 
in the form (4). 


Proof. Each functional $f(x)$ of this kind is expressible, for $x \in (AC)$, in the
form (8). The function $h$ can have an essential discontinuity at no more than
a countable set of points; for the contrary would imply the existence of a
number $k > 0$ such that the points where $h$ has an essential saltus $\ge k$ would
be a non-countable closed set, this set would contain a non-vacuous perfect
set, and by Lemma 2 there would exist points $x \in (CBV)$ at which $f$ fails to be
continuous. Being normalized, $h$ is continuous whenever it is essentially continuous;
hence $h$ is continuous except at a countable set of points and we may
write

$$x(0)h(0) + \int_0^1 h(t)\,x'(t)\,dt = x(0)h(0) + \int_0^1 h(t)\,dx(t).$$

This expression may be brought into the form (4) by setting $g(t) = h(0) - h(t)$.


THEOREM 5. Each continuous additive functional on (AC) can be expressed 
in the form (5). 


This result can be demonstrated in the same manner as Theorem 4, the
conclusion that $h$ cannot have essential discontinuities at a set $D$ of measure
$> 0$ being drawn from Lemma 2. For, if $|D|$ were $> 0$, there would exist a
$k > 0$ such that the set $D_k \subset D$ where $h$ has an essential saltus $\ge k$ would be
closed and of measure $> 0$; and $D_k$ would contain a perfect set $P$ with $|P| > 0$.
The function $x$ of Lemma 2 could then be taken as $\int_0^t \phi(u)\,du / |P|$, where $\phi$
is the characteristic function of $P \subset I$; i.e., there would exist points $x \in (AC)_0$
at which $f$ fails to be continuous.

4. Norms of the functionals. By Theorem 2.3 of A, any functional $f$ additive
and continuous on (BV) satisfies a Lipschitz condition at the zero-element
$\theta \in (BV)$, with a Lipschitz modulus which was called the “norm” of $f$
on (BV) and designated by the symbol $\|f\|_{(BV)}$. On the assumption that
$g(0) = 0$, which can be made without loss of generality, the upper bounds

$$T_0^1(g), \qquad |g(1)| + \sup_{t \in I} |g(t)| \tag{12}$$
and the lower bounds 
(13) Zoscrer g(t),  |g(1)|, super | g(4)| 


for $\|f\|_{(BV)}$ were determined(16) in A. These show that if $g$ is monotone on $I$,
$\|f\|_{(BV)} = T_0^1(g)$; in no other case, however, was the norm evaluated. We now
propose to evaluate the norms of the functionals (3), (4), and (5).


(16) See A, pp. 437-438. It was tacitly assumed there that the functional $f$ under consideration
was uniformly continuous on (BV); but this assumption was not used, the bounds being
determined solely from the Stieltjes integral form of the functional.




For convenience in this connection we adopt the following conventions.
If $g$ is a function on $I$ and

$$J = E_t[t' \le t \le t''],$$

we define $\Delta_J g = g(t'') - g(t')$ when each of the points $t'$, $t''$ is an end-point
of $I$ or a point of continuity of $g$, $\Delta_J g = 0$ otherwise; $|J|$ will stand for the
length of $J$; and $\nu(J)$ will be 0, 1, or 2 according as both, one, or neither of
the conditions $0 \in J$, $1 \in J$ is satisfied. We then have


THEOREM 6. The norm of each of the functionals (3), (4), and (5) is

$$\sup_J\, |\Delta_J g| \,/\, [\,|J| + \nu(J)\,] \tag{14}$$

as $J$ ranges over the set of closed subintervals of $I$.


Proof. Since $(AC) \subset (CBV)$ is dense in (CBV) and therefore in (BV), we
see that proving the norm of the functional (5) to be given by (14) is tantamount
to proving the theorem. We proceed to consider, then, the functional
(5).

Let $(S_g)$ represent the class of step-functions(17) each of which is continuous
at each point of discontinuity of $g$. Since the points of continuity of $g$
are dense in $I$, we infer that $(S_g) \subset (BV)$ is dense relative to (AC). Now $\int_0^1 x\,dg$
exists for $x \in (AC) + (S_g)$, and we define

$$F(x) = \int_0^1 x\,dg \quad \text{for } x \in (AC) + (S_g).$$

Clearly we have $F(x) = f(x)$ for $x \in (AC)$; and since $(S_g)$ is $\subset (BVN)$, we conclude
from Theorem 3 that $F$ is continuous on $(AC) + (S_g)$. Thus we obtain(18)

$$\|f\|_{(AC)} = \sup_{x \in (AC)} \int_0^1 x\,dg \,/\, \|x\| = \sup_{x \in (AC) + (S_g)} \int_0^1 x\,dg \,/\, \|x\| = \sup_{x \in (S_g)} \int_0^1 x\,dg \,/\, \|x\|.$$

Let $\epsilon$ be any positive number and $y \in (S_g)$, with $\|y\| = 1$, satisfy the inequality

$$\int_0^1 y\,dg > \|f\|_{(AC)} - \epsilon;$$

$y$ shall now be regarded as fixed. Let $(S_y')$ be the set of step-functions $x$ defined
by the condition $x \in (S_y')$ if and only if $x$ is continuous at each point $t$ where $y$

(17) It should be clearly understood that by a step-function we mean here a function consisting
of a finite number of steps each of length $> 0$.
(18) See A, p. 430. We use the notation $\|x\| = (x, \theta) = \int_0^1 |x(t)|\,dt + T_0^1(x)$.


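is continuous. Then $\int_0^1 x\,dg$, for $x \in (S_y')$ and $\|x\| = 1$, is a continuous bounded
function of the heights of the steps in $x$ and so assumes a maximum; thus
there exists a particular step-function $x_0 \in (S_y')$, with $\|x_0\| = 1$, such that

$$\int_0^1 x\,dg \le \int_0^1 x_0\,dg \quad \text{for } x \in (S_y'),\ \|x\| = 1,$$

which implies

$$\int_0^1 x\,dg \,/\, \|x\| \le \int_0^1 x_0\,dg \quad \text{for } x \in (S_y'),\ \|x\| > 0.$$

Let $t_0$ satisfy the condition

$$|x_0(t)| \le |x_0(t_0)| \quad \text{for } t \in I,$$

and set

$$\begin{aligned}
t' &= \inf E_t[x_0(u) = x_0(t_0) \text{ for } t < u < t_0], \\
t'' &= \sup E_t[x_0(u) = x_0(t_0) \text{ for } t_0 < u < t], \\
J_0 &= E_t[t' \le t \le t''].
\end{aligned}$$

Finally, let $x_\lambda \in (S_y')$ be defined thus

$$x_\lambda(t) = \begin{cases} (1 + \lambda)\,x_0(t_0) & \text{for } t \in J_0, \\ x_0(t) & \text{for } t \in I - J_0. \end{cases}$$

Then we have

$$\int_0^1 x_\lambda\,dg = \int_0^1 x_0\,dg + \lambda\,x_0(t_0)\,\Delta_{J_0} g$$

and since $|x_0(t)|$ is actually greater for $t \in J_0$ than it is for $t$ immediately to the
left or right of $J_0$, there exists an $\eta > 0$ such that we have

$$\|x_\lambda\| = \|x_0\| + \lambda\,|x_0(t_0)| \cdot |J_0| + \lambda\,|x_0(t_0)|\,\nu(J_0) \quad \text{for } \lambda > -\eta.$$

The function

$$H(\lambda) = \frac{\int_0^1 x_0\,dg + \lambda\,x_0(t_0)\,\Delta_{J_0} g}{\|x_0\| + \lambda\,|x_0(t_0)| \cdot |J_0| + \lambda\,|x_0(t_0)|\,\nu(J_0)}$$

is of the form $(a + b\lambda)/(c + d\lambda)$ with $cd > 0$, and it has a maximum at $\lambda = 0$.
Hence we have $H'(0) = (bc - ad)/c^2 = 0$ and $a/c = b/d$; i.e.,

$$\int_0^1 x_0\,dg = |\Delta_{J_0} g| \,/\, [\,|J_0| + \nu(J_0)\,].$$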




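Combining our results we obtain

$$\|f\|_{(AC)} - \epsilon < \int_0^1 y\,dg \le \int_0^1 x_0\,dg = |\Delta_{J_0} g| \,/\, [\,|J_0| + \nu(J_0)\,];$$

and since $\epsilon$ is an arbitrary positive number, we conclude that $\|f\|_{(AC)}$ is not
greater than the number (14).

On the other hand, if $J$ is any closed subinterval of $I$ and the function $x_J$
is defined by

$$x_J(t) = \begin{cases} \operatorname{sgn}(\Delta_J g) \,/\, [\,|J| + \nu(J)\,] & \text{for } t \in J, \\ 0 & \text{for } t \in I - J, \end{cases}$$

we see at once

$$\int_0^1 x_J\,dg = |\Delta_J g| \,/\, [\,|J| + \nu(J)\,],$$

whence

$$|\Delta_J g| \,/\, [\,|J| + \nu(J)\,] \le \sup_{x \in (S_g)} \int_0^1 x\,dg \,/\, \|x\| = \sup_{x \in (AC)} \int_0^1 x\,dg \,/\, \|x\| = \|f\|_{(AC)}.$$

Thus $\|f\|_{(AC)}$ is not less than the number (14), and the theorem is proved.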

It may be worth while to point out here that the formula (14) provides a
good basis for computation. For example, in the case of $g(t) = 4t(1 - t)$, the
value (14) is assumed for

$$J = J_0 = E_t[0 \le t \le 2^{1/2} - 1],$$

for which $\nu(J_0) = 1$, and the norm is $4(2^{1/2} - 1)(2 - 2^{1/2})/2^{1/2} = .69$ approximately.
In the case of $g(0) = g(1) = 0$, $g(.49) = -10$, $g(.51) = 10$ and $g$ linear
on each of the closed intervals $[0, .49]$, $[.49, .51]$, $[.51, 1]$, the value (14)
is assumed for

$$J = J_0 = E_t[.49 \le t \le .51],$$

for which $\nu(J_0) = 2$, and the norm is $20/2.02 = 9.90$ approximately. Minor
variants of this example show that $\|f\|_{(BV)}$ can be arbitrarily close to
$|g(1)| + \sup_{t \in I} |g(t)|$; and other examples to indicate that each of the estimates
(12) and (13) is in a sense the best possible can readily be constructed.
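
Both computations can be reproduced by a brute-force search over grid intervals. The sketch below is illustrative only: it assumes g is continuous, so that the increment over J = [a, b] is simply g(b) - g(a), and it restricts a and b to a uniform grid; the function names `nu`, `norm_estimate`, `parabola`, and `spike` are hypothetical.

```python
import math


def nu(a, b):
    # 0, 1, or 2 according as both, one, or neither of 0 and 1 lie in [a, b].
    return 2 - int(a == 0.0) - int(b == 1.0)


def norm_estimate(g, m=400):
    """Maximise |g(b) - g(a)| / (b - a + nu) over grid intervals [a, b]."""
    ts = [i / m for i in range(m + 1)]
    gs = [g(t) for t in ts]
    best = 0.0
    for i in range(m + 1):
        for j in range(i, m + 1):
            denom = (ts[j] - ts[i]) + nu(ts[i], ts[j])
            best = max(best, abs(gs[j] - gs[i]) / denom)
    return best


def parabola(t):
    return 4.0 * t * (1.0 - t)


def spike(t):
    # g(0) = g(1) = 0, g(.49) = -10, g(.51) = 10, linear in between.
    if t <= 0.49:
        return -10.0 * t / 0.49
    if t <= 0.51:
        return -10.0 + 1000.0 * (t - 0.49)
    return 10.0 * (1.0 - t) / 0.49


r2 = math.sqrt(2.0)
print(norm_estimate(parabola), 4 * (r2 - 1) * (2 - r2) / r2)
print(norm_estimate(spike), 20 / 2.02)
```

Each printed pair compares the grid estimate with the closed-form value quoted in the text. A grid search can only approach the supremum from below, so a finer grid may be needed for integrators whose optimal interval has end-points far from grid points.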

We may observe also that the inequality $|T_0^1(x) - T_0^1(y)| \le T_0^1(x - y)$ implies
that if $x_1$ is an arbitrary point in (BV), we have

$$\sup_{x \in (BV),\ (x, x_1) > 0} |f(x) - f(x_1)| \,/\, (x, x_1) \ \ge \sup_{x \in (BV),\ (x, x_1) > 0} |f(x - x_1)| \,/\, (x - x_1, \theta) = \|f\|_{(BV)};$$

i.e., that the Lipschitz modulus of a continuous additive functional $f$ at any
point in the space is never less than its Lipschitz modulus at the zero-element
$\theta$.

5. Weak topologies in (BV). The two forms of functionals (3) and (2) respectively
provide the basis for the following


DEFINITIONS. A sequence $x_n$ $(n = 1, 2, 3, \cdots)$ of elements of (BV) will be
said to converge weakly (S) [to converge weakly (W)] if and only if $\lim_{n\to\infty} f(x_n)$
exists for every functional $f$ additive and continuous [additive and uniformly
continuous] on (BV).


It is clear that convergence of a sequence $x_n$ in the metric (1) implies that
$x_n$ converges weakly (S), and that convergence of $x_n$ weakly (S) implies convergence
of $x_n$ weakly (W). That implications do not hold in the reverse
direction may be seen from more or less trivial examples. From (2), which
is also the general form of the continuous additive functional on the Banach
space (L), as has been remarked in §3, it is clear that the weak (W) topology
of (BV) is equivalent to the topology introduced in $(BV) \subset (L)$ by the weak
topology of (L). Since $(BV) \subset (L)$ is strongly dense in (L), it is apparent that
the weak closure of $(BV) \subset (L)$ is (L). It has been shown earlier that (L) is
weakly complete(19).


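THEOREM 7. In the topology of weak (S) convergence, (BV) is complete.


Proof. Let $x_n$ $(n = 1, 2, 3, \cdots)$ be any sequence in (BV) satisfying the
condition

$$\lim_{n\to\infty} \int_0^1 x_n\,dg \ \text{exists for every } g \in (C); \tag{15}$$

and let $\bar x_n$ be a function associated with $x_n$ $(n = 1, 2, 3, \cdots)$ as follows:

$$\bar x_n(t) = \begin{cases} x_n(t) & \text{for } 0 < t < 1, \\ 0 & \text{for } t = 0,\ t = 1. \end{cases}$$

Then we have $\int_0^1 x_n\,dg = \int_0^1 \bar x_n\,dg$ for every $g \in (C)$ and every $n$. For fixed $n$,

$$\int_0^1 x_n\,dg = \int_0^1 \bar x_n\,dg = -\int_0^1 g\,d\bar x_n \tag{16}$$

is a continuous additive functional(20) on the Banach space (C), with norm
equal to $T_0^1(-\bar x_n) = T_0^1(\bar x_n)$. The condition (15) implies that each $g \in (C)$ has a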


(19) See Banach, loc. cit., pp. 141-142.
(20) That is, linear functional, in the sense of Banach, loc. cit.




bounded sequence of images under the sequence of transformations (16). It
follows from a theorem of Banach and Steinhaus(21) that the sequence of
norms $T_0^1(\bar x_n)$ $(n = 1, 2, 3, \cdots)$ is bounded. Therefore the sequence of functions
$\bar x_n$ is uniformly bounded, and from a theorem of Helly(22) we conclude
the existence of a subsequence $\bar x_{n_i}$ $(i = 1, 2, 3, \cdots)$ which converges pointwise
for all $t \in I$ to a function $x_0 \in (BV)$. From Lebesgue’s convergence theorem we
infer that $\bar x_{n_i}$ converges in the mean to $x_0$; and from Theorem 1 above we
conclude that $\int_0^1 \bar x_{n_i}\,dg$, and therefore $\int_0^1 x_n\,dg$, tends to $\int_0^1 x_0\,dg$ for every $g \in (C)$.

One may readily verify the following remarks.

(i) In each of the weak topologies, the weak limit of a sequence $x_n$
$(n = 1, 2, 3, \cdots)$ in (BV) is not unique in the sense of uniqueness determined
by metric equality in (BV); it is, however, unique in the space (L).

(ii) In contrast to the situation in a Banach space(23), boundedness of the
sequence $\|x_n\|$ $(n = 1, 2, 3, \cdots)$, where $\|x_n\| = (x_n, \theta)$, is not a necessary condition
for weak convergence, in either sense, of a sequence $x_n$ in (BV); but
boundedness of the sequence $\int_0^1 |x_n(t)|\,dt$ $(n = 1, 2, 3, \cdots)$ is of course necessary.

6. The use of the metric

$$(x, y) = \int_0^1 |x - y|\,dt + |L_0^1(x) - L_0^1(y)|, \tag{17}$$

where $L_0^1(z)$ stands in general for the (Peano) length of the function $z(t)$ on $I$.
When this metric is employed the situation is as described in the following


THEOREM 8. Each uniformly continuous additive functional on (BV) [or on
(CBV) or on (AC)] can be expressed in the form (2). The general form of the
continuous additive functional on (BV) is (3), on (CBV) is (4), and on (AC)
is (8). Conversely, each integral of the kind specified is such a functional on the
corresponding space. The functional (8) on (AC) satisfies a Lipschitz condition
at any given point $x_1 \in (AC)$ if and only if $h(t)$ satisfies a Lipschitz condition on
$I$; in this event the integral (8) can be expressed in the form (2), with $g(t) = h(0) - h(t)$,
and this integral defines a functional $f$ on (BV) which satisfies a Lipschitz
condition on the entire space (BV); and the Lipschitz modulus of $f$ on
(BV) [or on (CBV) or on (AC)] is the same at each point $x_1$ of (BV) [or of
(CBV) or of (AC)] as it is for the entire space, being equal to the Lipschitz
modulus of $g(t) = h(0) - h(t)$ on $I$.


We shall endeavor to indicate the proof of these results without going into 
excessive detail. Naturally it must be noted at the outset that since the metric 
(17) is not homogeneous (i.e., does not satisfy the condition (ax, ay) 


(21) See, for example, Banach, loc. cit., p. 80.

(22) See Helly, Über lineare Funktionaloperationen, Sitzungsberichte der Wiener Akademie,
IIa, vol. 121 (1912), p. 283.

(23) See, for example, Banach, loc. cit., p. 133.



$= |a| \cdot (x, y)$, for $a$ a real number), the spaces considered are neither of the
type (a) nor of the type (a*), so that the results of §2 of A cannot be drawn
upon. In particular, there would seem to be no a priori reason why a uniformly
continuous additive functional should satisfy a Lipschitz condition(24);
that such is the case, however, will presently appear. It is a simple matter to
show that a continuous additive functional on (BV) or one of its subspaces
here considered is homogeneous(25).

Let $f$ be any additive and uniformly continuous functional on (BV). As
in the proof of Theorem 5.2 of A and of Theorem 2 above, introduce the
family of step-functions $\xi_t(u)$ and define $f(\xi_t) = g(t)$. In view of the uniform
continuity of $f$, let $\delta > 0$ be such that

$$|f(x) - f(y)| < 1 \quad \text{for } (x, y) \le \delta.$$

Let $0 \le t_1 < t_2 < 1$ and let $m$ satisfy the condition

$$m \int_0^1 |\xi_{t_1} - \xi_{t_2}|\,du = \int_0^1 |m\xi_{t_1} - m\xi_{t_2}|\,du = \delta.$$

Then we have

$$m\,|g(t_1) - g(t_2)| = m\,|f(\xi_{t_1}) - f(\xi_{t_2})| = |f(m\xi_{t_1}) - f(m\xi_{t_2})| < 1,$$

whence

$$|g(t_1) - g(t_2)| < \frac{1}{m} = \frac{1}{\delta} \int_0^1 |\xi_{t_1} - \xi_{t_2}|\,du = \frac{|t_1 - t_2|}{\delta};$$

i.e., $g(t)$ satisfies a Lipschitz condition on the interval $0 \le t < 1$. Consider

$$x_1(t) = \begin{cases} 0 & \text{for } 0 \le t < 1, \\ k & \text{for } t = 1, \end{cases} \qquad
x_2(t) = \begin{cases} 0 & \text{for } 0 \le t < 1, \\ -k & \text{for } t = 1. \end{cases}$$

Since $(x_1, x_2) = 0$ and $(x_1, -x_2) = 0$, we have $f(x_1) = f(x_2) = f(-x_2) = -f(x_2) = 0$;
and from the additivity of $f$ it follows that the value of $f(x)$ is independent
of the value of $x(1)$. The sequence of step-functions $z_n(t)$ employed in the
proof of Theorem 5.2 of A will then have the property that in the present
metric $(z_n, \bar x) \to 0$, with $\bar x$ identical with $x$ except at $t = 1$ where it differs from $x$
by $1 + T_0^1(x) - L_0^1(x)$. The argument set forth in that proof then shows that
$f(x)$ can be expressed as $\int_0^1 x(t)\,dg(t)$, where $g(t)$ is Lipschitzian on $I$; i.e., $f(x)$
can be given the form (2), where $\operatorname{ess\,sup}_{t \in I} |\alpha(t)| = M$ is the Lipschitz modulus
of $g$ on $I$. The existence of a Lipschitz modulus for $f$ on (BV), and its


(24) For example, the linear functionals $f(x) = kx$ $(k \ne 0)$ on the Euclidean space $E_1$ metrised
with $(x, y) = |x^3 - y^3|$, which is not homogeneous, do not satisfy a Lipschitz condition.
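
The failure of the Lipschitz condition in this example is easy to see numerically. The sketch below assumes the reading f(x) = kx with k nonzero and the distance |x^3 - y^3|; the function name `ratio` is hypothetical. With y = 0 the quotient equals |k| / x^2, which grows without bound as x tends to 0.

```python
def ratio(k, x, y):
    """|f(x) - f(y)| / (x, y) for f(x) = k*x and the distance |x^3 - y^3|."""
    return abs(k * x - k * y) / abs(x ** 3 - y ** 3)


k = 1.0
for x in (1.0, 0.1, 0.01, 0.001):
    # With y = 0 the quotient is |k| / x**2.
    print(x, ratio(k, x, 0.0))
```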

(25) Compare the reasoning used in the proof of Theorem 2.4 of A and make use of the
relation $|L_0^1(x) - L_0^1(y)| \le T_0^1(x - y)$; see inequality (4) of Adams and Lewy, On convergence in
length, Duke Mathematical Journal, vol. 1 (1935), pp. 19-26.




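equality to $M$, then follows from the relation

$$\begin{aligned}
M &= \sup_{0 \le t_1 < t_2 < 1} |g(t_1) - g(t_2)| \,/\, |t_1 - t_2| \\
&= \sup_{0 \le t_1 < t_2 < 1} |f(\xi_{t_1}) - f(\xi_{t_2})| \,/\, |t_1 - t_2| \\
&= \sup_{0 \le t_1 < t_2 < 1} |f(\xi_{t_1}) - f(\xi_{t_2})| \,/\, (\xi_{t_1}, \xi_{t_2}) \\
&\le \sup_{x, y \in (BV),\ (x, y) > 0} |f(x) - f(y)| \,/\, (x, y) \\
&\le \sup_{x, y \in (BV),\ (x, y) > 0} M \int_0^1 |x - y|\,dt \,/\, (x, y) \le M.
\end{aligned}$$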


Since (CBV) and (AC) are dense in (BV), a functional $f$ additive and uniformly
continuous on either subspace can be extended to be uniformly continuous
on (BV). As has been done in the proof of Theorem 5.3 of A, one may
then show that the extended functional is additive on (BV). Consequently
the form of $f$ is determined as asserted.

That the general form of the continuous additive functional on (BV) is (3)
can be shown by essentially the same argument as has been used in the proof
of Theorem 3 above.

To determine the form of the continuous additive functional on (CBV),
it should be noted that if the set $P$ in Lemma 2 is of measure zero, $x$ is a
singular function. According to a theorem of Morse(26), since $x$ is singular,
$(x_n, x) \to 0$ in the metric (1) implies $(x_n, x) \to 0$ in the metric (17); hence functions
$\lambda$, $\mu$ exist such that in the metric (17), $(\lambda, \mu)$ is $< \delta$ and
$\int_0^1 h(t)\lambda'(t)\,dt - \int_0^1 h(t)\mu'(t)\,dt$ is $\ge k/2$. Since any non-vacuous perfect set contains such a
set with measure zero, only trivial modifications in the proof of Theorem 4
need now be made in order to establish the desired result.

Since in $(AC)_0$ convergence in the metric (17) is equivalent(27) to convergence
in the metric $(x, y) = T_0^1(x - y)$, it follows that the general form of the
continuous additive functional on $(AC)_0$ is (6) and on (AC) is (8).

The converse statements concerning the uniform continuity of (2) on
(BV) and the continuity of (3), (4), and (8) on (BV), (CBV), and (AC) respectively
are very easily verified.

The fact that, if $x_1$ is an arbitrary point of (AC), the functional (8) on
(AC) satisfies a Lipschitz condition at $x_1$ only when $h(t)$, and therefore
$g(t) = h(0) - h(t)$, satisfies a Lipschitz condition on $I$ is a consequence of


LEMMA 3. The functional (8) has the following property when $x_1$ is any point
of (AC):

$$\sup_{x \in (AC),\ (x, x_1) > 0} |f(x) - f(x_1)| \,/\, (x, x_1) \ \ge \sup_{0 \le t_1 < t_2 \le 1} |h(t_1) - h(t_2)| \,/\, |t_1 - t_2|. \tag{18}$$


(26) See Morse, Convergence in variation and related topics, these Transactions, vol. 41
(1937), pp. 48-83, Theorem 5.2.

(27) The equivalence is a consequence of Theorems 4 and 5 of Adams and Lewy, loc. cit.


Proof. In the light of remarks made in the paragraph following (7) it
suffices to prove that the left member of (18) is $\ge |h(b) - h(a)| / (b - a)$, where
$0 \le a < b \le 1$, in four cases: (i) that in which $a = 0$, $b = 1$, which is trivially
verified by taking $x(t) = x_1(t) + 1$ for $t \in I$; (ii) that in which $0 < a < b < 1$ with $a$
and $b$ points of the Lebesgue set(28) for the function $h$; (iii) that in which
$a = 0$ and $b$, $0 < b < 1$, is a point of the Lebesgue set for $h$; and (iv) that in
which $b = 1$ and $a$, $0 < a < 1$, is a point of the Lebesgue set for $h$.

To dispose of case (ii) let $0 < \delta < (b - a)/2$, let $\epsilon$ be an arbitrary number
(positive, negative, or zero), and consider the function

$$x_{\delta,\epsilon}(t) = \begin{cases}
x_1(t) & \text{for } 0 \le t \le a \text{ and for } b \le t \le 1, \\
x_1(t) + \epsilon(t - a)/\delta & \text{for } a \le t \le a + \delta, \\
x_1(t) + \epsilon & \text{for } a + \delta \le t \le b - \delta, \\
x_1(t) - \epsilon(t - b)/\delta & \text{for } b - \delta \le t \le b.
\end{cases}$$


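For each $\delta$ and $\epsilon$ we have

$$\begin{aligned}
|f(x_{\delta,\epsilon}) - f(x_1)|
&= \left| \int_a^{a+\delta} h(t)\,\epsilon/\delta\,dt - \int_{b-\delta}^{b} h(t)\,\epsilon/\delta\,dt \right| \\
&= \left| \epsilon[h(a) - h(b)] + \epsilon \int_a^{a+\delta} [h(t) - h(a)]\,dt/\delta - \epsilon \int_{b-\delta}^{b} [h(t) - h(b)]\,dt/\delta \right| \\
&\ge |\epsilon| \cdot |h(b) - h(a)| - |\epsilon|\,\eta_1(\delta) - |\epsilon|\,\eta_2(\delta),
\end{aligned} \tag{19}$$

where, as $\delta \to 0$,

$$\eta_1(\delta) = \int_a^{a+\delta} |h(t) - h(a)|\,dt/\delta \to 0, \qquad \eta_2(\delta) = \int_{b-\delta}^{b} |h(t) - h(b)|\,dt/\delta \to 0.$$

(28) See, for example, Titchmarsh, The Theory of Functions, Oxford, 1932, p. 364. For each
$h$ summable on $I$ the “Lebesgue set” of points $t$ where $\int_t^{t+\delta} |h(u) - h(t)|\,du/\delta$ tends to zero with $\delta$
has measure 1.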




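For each fixed $\delta$, the function

$$\begin{aligned}
\phi(\delta, \epsilon) = L_0^1(x_{\delta,\epsilon}) - L_0^1(x_1)
&= \int_a^{a+\delta} \{1 + [x_1'(t) + \epsilon/\delta]^2\}^{1/2}\,dt + \int_{b-\delta}^{b} \{1 + [x_1'(t) - \epsilon/\delta]^2\}^{1/2}\,dt \\
&\quad - \int_a^{a+\delta} \{1 + [x_1'(t)]^2\}^{1/2}\,dt - \int_{b-\delta}^{b} \{1 + [x_1'(t)]^2\}^{1/2}\,dt
\end{aligned}$$

vanishes at $\epsilon = 0$, and $\partial\phi/\partial\epsilon$ may easily be seen to exist(29) for each $\epsilon$. If
$\partial\phi/\partial\epsilon = 0$ at $\epsilon = 0$, we have

$$\frac{|f(x_{\delta,\epsilon}) - f(x_1)|}{(x_{\delta,\epsilon}, x_1)} \ge \frac{|h(b) - h(a)| - \eta_1(\delta) - \eta_2(\delta)}{(b - a - \delta) + |[\phi(\delta, \epsilon) - \phi(\delta, 0)]/\epsilon|}$$

with $[\phi(\delta, \epsilon) - \phi(\delta, 0)]/\epsilon \to 0$ as $\epsilon \to 0$. If $\partial\phi/\partial\epsilon \ne 0$ at $\epsilon = 0$, there exists a unilateral
neighborhood of zero such that for $\epsilon$ in this neighborhood, $L_0^1(x_{\delta,\epsilon})$ is
$< L_0^1(x_1)$. Moreover it is evident that $L_0^1(x_{\delta,\epsilon})$ is a continuous function of $\epsilon$
which becomes positively infinite as $\epsilon \to +\infty$ or $\epsilon \to -\infty$. Hence there exists
an $\epsilon$, say $\epsilon_\delta$, for which $L_0^1(x_{\delta,\epsilon}) = L_0^1(x_1)$. In fact the inequality(30)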


(20) 


+ 


shows that for | e| = L2** (1) (x1), Li(xs,e) exceeds Li(x,); this gives an 
upper bound for | €s| and shows incidentally that ¢;—-0 with 6. Choosing 
€=€;, we have 


%1) b—a-—6 


(21) 


The inequality (18) now follows at once from (19), (20), and (21). 
Although cases (iii) and (iv) are not formally symmetric, it should suffice 
to examine one of them. In case (iii), for example, we may define 
for 
= — et — for |, 
+ for 
and find 


(?*) See, for example, Hobson, loc. cit., 2d edition, vol. 2, Cambridge, 1926, p. 355. 
(°) See Adams and Lewy, loc. cit., inequalities (2) and (3). 




J | — dt =| «| 8/2), 


eh(0) — h(t)e/6 at | 


e[k(0) — h(b)]—e f — n(b) |dt/6 | 


| «| - | (6) — 2(0)| —| e| n2(8), 


whence we may proceed as before in case (ii). 

To complete the proof of the results stated in the theorem, it is sufficient 
to make the following remark. Let f be a continuous additive functional on 
(BV) [(CBV)], and let y; be an arbitrary point of (BV) [(CBV)]. Set 


x(t) = fox (u)du, zi(t) = y(t) — tel, 


so that x; is the absolutely continuous part, and 2; the singular part, of the 
function y;. Then we have, for ye (BV) [ye (CBV)], 
f(y — 91) | Ay, 94) 
supze (ac),(z,2,)>0| + a1 — — 21) | /(x + 21, + 21) 
= supze (ac),(2,2,)>0| f(x — mi) | /(x, 1), 
since | L5(x+21) =| Lo(x) +79 (a1) —Lo (x1) — 


BROWN UNIVERSITY, 
PROVIDENCE, R. I., 

THE UNIVERSITY OF CALIFORNIA, 
BERKELEY, CALIF. 




A NEW SPECIAL FORM OF THE LINEAR 
ELEMENT OF A SURFACE 


BY 
JESSE DOUGLAS 


1. Introduction and statement of results. The great circles of a sphere
form a family of $\infty^2$ curves having the following three important properties:

(1) They are geodesics of the sphere. 

(2) They are a linear system; that is, a point transformation exists which 
converts them into the straight lines of a plane. Indeed, central projection 
of the sphere on any plane not passing through its center will accomplish 
such a transformation. An equivalent statement is that it is possible to intro- 
duce coordinates u, v on the spherical surface so that the totality of great 
circles is represented by the general linear equation: au+bv+c=0. 

(3) The angular excess of any triangle ABC formed by great circles is propor- 
tional to the area of the triangle: 


(1.1)    $\mathcal{E} = A + B + C - \pi = k\mathcal{A},$


where the factor of proportionality k is equal to the Gaussian curvature of
the sphere: $k = 1/R^2$.

It is evident that all the geometric entities and properties involved in 
these three statements are invariant under any bending or isometric
transformation of the spherical surface(1) together with its great circles; this means
a point transformation into a family of $\infty^2$ curves upon another surface so
that ds = ds', where ds denotes the element of length of the sphere and ds'
the corresponding element of the transformed surface. According to a funda- 
mental theorem of Gauss, the Gaussian curvature K of the transformed sur- 
face must be the same as that of the sphere, therefore constant. Evidently, 
too, the geodesics of the sphere go over into the geodesics of the transformed 
surface. It follows that the three stated properties are possessed also by the 
geodesics of any surface of constant Gaussian curvature. 

Let us denote by $(\mathcal{F}, S)$ the geometric configuration consisting of a family
$\mathcal{F}$ of $\infty^2$ curves upon a surface S. Then it is obvious that, in respect of the
possession or non-possession of any of our three properties, the configuration
$(\mathcal{F}, S)$ is completely equivalent to any configuration $(\mathcal{F}', S')$ derived
therefrom by isometric transformation. Any two such isometric configurations will


Presented to the Society, April 27, 1940; received by the editors March 5, 1940. This paper 
was received by the editors of the Annals of Mathematics, May 11, 1939, accepted by them, 
and later transferred to these Transactions. 

(1) Meaning a properly limited region of the spherical surface. As is well known, the sphere
as a whole is indeformable. In general, all our considerations will be local or differential- 
geometric. 




therefore be regarded as essentially identical. In other words, all that is rele- 
vant concerning the surface S is its first fundamental form 


(1.2)    $ds^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2.$


The family $\mathcal{F}$ may always be defined by a differential equation of second
order:


(1.3)    $v'' = \phi(u, v, v') \qquad (v' = dv/du,\ v'' = d^2v/du^2).$


Thus any configuration $(\mathcal{F}, S)$ is represented analytically by a system of
functions $[E(u, v), F(u, v), G(u, v), \phi(u, v, v')]$.

With every two properties that may be selected from the three stated at
the beginning, we may associate a corresponding converse problem. Thus we
may ask for all configurations $(\mathcal{F}, S)$ which have:

(a) properties (1) and (3), 

(b) properties (1) and (2), 

(c) properties (2) and (3). 

The answer to the converse question (a) is classical. According to a theo- 
rem of Gauss, the angular excess E of any triangle ABC formed by three 
geodesics of a surface S is given by the formula(2)


$\mathcal{E} = \iint_{ABC} K\,d\omega,$


where $d\omega$ denotes the element of area. By the law of the mean, this gives
$\mathcal{E} = K(m)\mathcal{A}$, where K(m) denotes the value of the Gaussian curvature at
some point m of ABC, while $\mathcal{A}$ denotes the area of this triangle. It follows
immediately that, as the triangle ABC shrinks to any fixed point p of S,


$\lim \mathcal{E}/\mathcal{A} = K(p).$


Property (3) then implies that the Gaussian curvature of the surface S is
constant: K(p) = k for every point p of S. That the family $\mathcal{F}$ consists of the
geodesics of S is part of the data of problem (a).

The answer to the converse question (b) is also classical, having been
given by Beltrami in 1865(3). He proved the theorem: if a surface S can be
represented point by point on a plane so that the geodesics of S correspond to the
straight lines of the plane, then S has constant Gaussian curvature. Thus, again,
the only solution of the converse problem is the one which is known a priori.

The converse problem (c). It is curious that the converse problem (c) has
not hitherto been studied. Here I have found the solution $(\mathcal{F}, S)$ to be more
general than the geodesics of a surface of constant curvature. In fact, the


(2) In formula (2.2) of the next section, let $1/\rho = 0$, expressing the geodesic character of the
sides of the triangle.
(3) E. Beltrami, Opere Matematiche, vol. 1, pp. 262-280.




complete solution is given by the formulas which follow, whose derivation 
constitutes the purpose of the present paper. 

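In formulating our results, it is convenient to use, instead of general
coordinates u, v upon the surface S, wherein $ds^2$ has the form (1.2), minimal
coordinates, wherein


(1.4)    $ds^2 = 2F\,du\,dv.$


The characteristic property of minimal coordinates is that the coordinate
lines u = const., v = const. have zero length(4), or are the minimal lines of the
surface S. Such coordinates are determined only up to an arbitrary
transformation $u_1 = \phi(u)$, $v_1 = \psi(v)$, which preserves the minimal character of the
coordinate lines.

Let $U_1, U_2$ denote any two functions of u alone, and $V_1, V_2$ any two
functions of v alone. Form the determinants

(1.5)    $\mathrm{I} = \begin{vmatrix} U_1 + V_1 & U_1' \\ U_2 + V_2 & U_2' \end{vmatrix}, \qquad \mathrm{II} = \begin{vmatrix} U_1 + V_1 & V_1' \\ U_2 + V_2 & V_2' \end{vmatrix},$

where the accent denotes differentiation. We always suppose $U_1, U_2, V_1, V_2$
so chosen that $\mathrm{I} \ne 0$, $\mathrm{II} \ne 0$. Then we shall prove that the most general
configuration $(\mathcal{F}, S)$ having the properties (2), (3) is represented by the formulas:


(1.6)    $\mathcal{F}:\quad v'' = Bv' + Cv'^2,$


where


(1.6')    $B = \dfrac{\partial}{\partial u}(\log \mathrm{I} - 2\log \mathrm{II}), \qquad C = \dfrac{\partial}{\partial v}(2\log \mathrm{I} - \log \mathrm{II}),$


while, for the surface S,


(1.7)    $S:\quad 2kF = \dfrac{\partial^2}{\partial u\,\partial v}(\log \mathrm{I} + \log \mathrm{II}).$


With the help of a simple determinant transformation(5), we find

(1.8)    $\dfrac{\partial^2}{\partial u\,\partial v}\log \mathrm{I} = \dfrac{\mathrm{II}}{\mathrm{I}^2}\begin{vmatrix} U_1' & U_1'' \\ U_2' & U_2'' \end{vmatrix}, \qquad \dfrac{\partial^2}{\partial u\,\partial v}\log \mathrm{II} = \dfrac{\mathrm{I}}{\mathrm{II}^2}\begin{vmatrix} V_1' & V_1'' \\ V_2' & V_2'' \end{vmatrix};$

hence the expanded form of (1.7) is

(1.7')    $2kF = \dfrac{\mathrm{II}}{\mathrm{I}^2}\begin{vmatrix} U_1' & U_1'' \\ U_2' & U_2'' \end{vmatrix} + \dfrac{\mathrm{I}}{\mathrm{II}^2}\begin{vmatrix} V_1' & V_1'' \\ V_2' & V_2'' \end{vmatrix}.$
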
The case k=0. In interpreting these results, it is important to distinguish 


(4) Of course, these coordinate lines must be imaginary.
(5) Formula (3.19).




the case k = 0 from $k \ne 0$. If k = 0, property (3) becomes the statement that
the sum of the angles of every triangle of $\mathcal{F}$ is two right angles. In addition, $\mathcal{F}$
is required to be linear, by property (2). Now, an arbitrary surface S is
capable of conformal representation upon a plane P, and in infinitely many ways,
since we may combine any given conformal representation of S on P with an
arbitrary conformal transformation of P into itself: $u' + iv' = f(u + iv)$. Let $\mathcal{F}$
denote the family of $\infty^2$ curves on S which results from the family of all
straight lines of the plane P by any conformal map of S on P. Then $\mathcal{F}$ is
obviously linear, and also the angle-sum of every triangle of $\mathcal{F}$ is two right
angles, since these are conformally invariant properties which belong to the
straight lines. Thus, an arbitrary surface S carries infinitely many families of
curves $\mathcal{F}$ which are linear and in which every triangle has an angle-sum of
two right angles.

This finds expression in the formula (1.7') by the circumstance that when
k = 0 this formula implies no restriction on the function F, that is, on the $ds^2$
of S, but rather only a condition on the functions $U_1, U_2, V_1, V_2$ which
determine the family $\mathcal{F}$. Indeed, if k = 0, it follows from (1.7) that


(1.9)    $\mathrm{I}\cdot\mathrm{II} = U_3V_3,$
and from (1.7’) that either 


(1.9”) 


where 

Uj? Vy! jus 
veri” 
or else that 
(1.10) 


By (1.9) and (1.9’), 
(1.11) 
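therefore

$\dfrac{\partial^2}{\partial u\,\partial v}\log \mathrm{I} = 0, \qquad \dfrac{\partial^2}{\partial u\,\partial v}\log \mathrm{II} = 0,$

whence by (1.8), since by hypothesis $\mathrm{I} \ne 0$, $\mathrm{II} \ne 0$, we deduce

(1.12)    $\begin{vmatrix} U_1' & U_1'' \\ U_2' & U_2'' \end{vmatrix} = 0, \qquad \begin{vmatrix} V_1' & V_1'' \\ V_2' & V_2'' \end{vmatrix} = 0.$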




(1.10) is the same as (1.12), which therefore subsists in either case. (1.12)
implies the existence of linear relations with constant coefficients:


(1.13)    $c_1U_1 + c_2U_2 = c_3, \qquad c_1'V_1 + c_2'V_2 = c_3',$


where either $c_1 \ne 0$ or $c_2 \ne 0$, and either $c_1' \ne 0$ or $c_2' \ne 0$.
It is evident by the defining formulas (1.5) that, under the conditions
(1.13), the functions I, II must have the forms (1.11). Therefore, by (1.6'),


$B = U, \qquad C = V;$

accordingly, the differential equations (1.6) of $\mathcal{F}$ have the form

(1.14)    $v'' = Uv' + Vv'^2.$
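
The step from the vanishing determinants (1.12) to the linear relations (1.13) can be sketched as follows. This is only an illustration under the added assumption that $U_2'$ does not vanish on the interval considered; the constants c and $c_3$ are names introduced here.

```latex
\begin{align*}
\begin{vmatrix} U_1' & U_1'' \\ U_2' & U_2'' \end{vmatrix}
  = U_1'U_2'' - U_2'U_1'' = 0
  \;&\Longrightarrow\;
  \frac{d}{du}\!\left(\frac{U_1'}{U_2'}\right)
  = \frac{U_1''U_2' - U_1'U_2''}{U_2'^{\,2}} = 0 \\
  &\Longrightarrow\; U_1' = c\,U_2'
  \;\Longrightarrow\; U_1 - c\,U_2 = c_3 ,
\end{align*}
```

which is a relation of the form (1.13) with $c_1 = 1$, $c_2 = -c$. The same argument applied to $V_1, V_2$ gives the second relation.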


It is easily verified that this is the general form of differential equation
for a family $\mathcal{F}$ derivable by conformal transformation: $u_1 = \phi(u)$, $v_1 = \psi(v)$,
from the straight lines of a plane: $v_1'' = 0$; explicitly, $U = \phi''(u)/\phi'(u)$,
$V = -\psi''(v)/\psi'(v)$. In summary, we have a proof of the following theorem:

If a family $\mathcal{F}$ of $\infty^2$ curves on a surface S is linear, and the sum of the angles
in every triangle of $\mathcal{F}$ is two right angles, then $\mathcal{F}$ must be a conformal image of
the $\infty^2$ straight lines of a plane.

In a previous paper(6), the author proved this theorem synthetically. The
first statement and proof is an analytic one by E. Kasner(7).
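
The verification referred to before the theorem can be sketched as follows, assuming $\phi'(u) \ne 0$ and $\psi'(v) \ne 0$ (an assumption added here so that the divisions are legitimate). One differentiates $v_1 = \psi(v)$ twice with respect to $u_1 = \phi(u)$ and sets the result equal to zero:

```latex
\begin{align*}
\frac{dv_1}{du_1} &= \frac{\psi'(v)\,v'}{\phi'(u)}, \\
\frac{d^2v_1}{du_1^2} &= \frac{1}{\phi'}\,\frac{d}{du}\!\left(\frac{\psi' v'}{\phi'}\right)
  = \frac{\psi'' v'^2 + \psi' v''}{\phi'^{\,2}} - \frac{\psi' v'\,\phi''}{\phi'^{\,3}}, \\
v_1'' = 0 \;&\Longrightarrow\; v'' = \frac{\phi''}{\phi'}\,v' - \frac{\psi''}{\psi'}\,v'^2 .
\end{align*}
```

Comparison with (1.14) gives $U = \phi''/\phi'$ and $V = -\psi''/\psi'$, as stated.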

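The case $k \ne 0$. Of more interest is the case $k \ne 0$. Then formula (1.7) or
(1.7') really specializes the surface S: its $ds^2$ must have, in minimal
coordinates, the form

(1.15)    $ds^2 = \dfrac{1}{k}\,\dfrac{\partial^2}{\partial u\,\partial v}(\log \mathrm{I} + \log \mathrm{II})\,du\,dv = \dfrac{1}{k}\left\{ \dfrac{\mathrm{II}}{\mathrm{I}^2}\begin{vmatrix} U_1' & U_1'' \\ U_2' & U_2'' \end{vmatrix} + \dfrac{\mathrm{I}}{\mathrm{II}^2}\begin{vmatrix} V_1' & V_1'' \\ V_2' & V_2'' \end{vmatrix} \right\} du\,dv.$

Upon all and only such surfaces S can curve families $\mathcal{F}$ be found with
properties (2) and (3).

In order that this form of the surface S shall not be degenerate, it is
necessary and sufficient (besides $\mathrm{I} \ne 0$, $\mathrm{II} \ne 0$) that $F \ne 0$, or

$\dfrac{\partial^2}{\partial u\,\partial v}(\log \mathrm{I} + \log \mathrm{II}) \ne 0.$

But in the contrary case, we have seen by the calculations (1.9)-(1.13) that
we must have (1.12) or its equivalent (1.13). Conversely, it is evident that
(1.12) implies F = 0.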


(6) Number 2 of the list of references at the end.
(7) Reference [1].




Hence, under the hypothesis $\mathrm{I} \ne 0$, $\mathrm{II} \ne 0$, the formula (1.15) will give a
nondegenerate surface S when and only when linear relations of the form (1.13)
do not subsist simultaneously between $U_1, U_2$ and between $V_1, V_2$. If $U_1, U_2$
are not both constant and $V_1, V_2$ are not both constant, the condition that
relations of the form (1.13) or (1.12) shall not hold simultaneously is sufficient
to guarantee in addition that $\mathrm{I} \ne 0$, $\mathrm{II} \ne 0$.

This completes our description of the special form of the linear element 
of the surface S signified by the title of the present paper. 

An indication that this type of surface S is more general than one of
constant curvature results by a count of arbitrary functions. The most general
form of the $ds^2$ of a surface of constant curvature c referred to minimal
coordinates is(8)

(1.16)    $ds^2 = \dfrac{4U'V'}{c(U - V)^2}\,du\,dv,$


thus involving, besides the arbitrary constant c, only the two arbitrary
functions U of u and V of v, which determine the distribution of parametric values
u, v over the two systems of minimal lines respectively. The formula (1.15),
on the other hand, involves four general functions $U_1, U_2, V_1, V_2$, subject only
to the slight restrictions of linear independence which we have mentioned. Of
these four functions, two correspond to an arbitrary transformation $u_1 = \phi(u)$,
$v_1 = \psi(v)$ on the surface S which conserves the minimal lines, so that only two
of the arbitrary functions are really effective in varying the form of S. We
may say that, if isometric surfaces are regarded as identical, there are only $\infty^1$
surfaces of constant curvature, depending on the value c of this curvature,
whereas the category of surfaces S with properties (2) and (3) involves two
arbitrary functions of one variable.

This indication is, of course, not completely decisive, since there remains
the question of whether $U_1, U_2, V_1, V_2$ are all essential. To obtain a definite
proof that the formula (1.7') contains surfaces not of constant curvature, we
may calculate the Gaussian curvature by the formula

(1.17)    $K = -\dfrac{1}{F}\,\dfrac{\partial^2 \log F}{\partial u\,\partial v} = \dfrac{F_uF_v - FF_{uv}}{F^3}.$

The result is a rational function of $U_1, U_2, V_1, V_2$ and their derivatives of the
first three orders. A partial calculation suffices to show that this rational
function does not reduce identically to a constant when all the quantities
mentioned are considered as independent variables (as they may be, since the
functions $U_1, U_2, V_1, V_2$ are arbitrary, and they and any finite number of
their derivatives are therefore capable of taking any assigned values for any
finite number of given values of (u, v)). Consequently, we can arrange to give


(8) Cf. G. Darboux, Théorie Générale des Surfaces, 1887 edition, vol. 1, p. 30. Write x = U,
y = V in formula (1), p. 30.




these functions and those of their derivatives which appear in the expression
for K such particular values at any two chosen points $(u_1, v_1)$, $(u_2, v_2)$ that
$K(u_1, v_1) \ne K(u_2, v_2)$; therefore K will not be a constant.
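
As a check on the curvature formula (1.17), it can be applied to the constant-curvature element (1.16). The sketch below assumes that (1.16) reads $ds^2 = 4U'V'\,du\,dv / c(U - V)^2$, so that $F = 2U'V'/c(U - V)^2$ in the notation $ds^2 = 2F\,du\,dv$ of (1.4).

```latex
\begin{align*}
\log F &= \log\frac{2}{c} + \log U' + \log V' - 2\log(U - V), \\
\frac{\partial^2 \log F}{\partial u\,\partial v}
  &= -2\,\frac{\partial}{\partial u}\!\left(\frac{-V'}{U - V}\right)
   = -\frac{2U'V'}{(U - V)^2}, \\
K &= -\frac{1}{F}\,\frac{\partial^2 \log F}{\partial u\,\partial v}
   = \frac{c\,(U - V)^2}{2U'V'}\cdot\frac{2U'V'}{(U - V)^2} = c .
\end{align*}
```

The terms $\log U'$ and $\log V'$ drop out because each depends on one variable only, which is why only the two functions U, V and the constant c enter.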

2. Conditions for the property $\mathcal{E} = k\mathcal{A}$. We begin the proof of the results
stated in §1 by recalling the formula of Gauss-Bonnet(9). If $\Gamma$ denote any
closed curve with continuously turning tangent, bounding a region R, then


$\displaystyle \int_\Gamma \frac{ds}{\rho} + \iint_R K\,d\omega = 2\pi,$


where $1/\rho$ is the geodesic curvature of $\Gamma$. In case the curve $\Gamma$ has corners at
the points $P_i$ (i = 1, 2, ..., n), then this formula must be modified as follows:


(2.1)    $\displaystyle \int_\Gamma \frac{ds}{\rho} + \sum_{i=1}^{n} \theta_i + \iint_R K\,d\omega = 2\pi.$


Here $\theta_i$ represents the angle, taken with proper sign, between the sensed
tangents to the two arcs of $\Gamma$ which form the corner at $P_i$.

Let us apply formula (2.1) to any triangle ABC formed by three curves
of $\mathcal{F}$. The boundary $\Gamma$ of this triangle has corners at $P_1, P_2, P_3 = A, B, C$, and
$\theta_1 = \pi - A$, $\theta_2 = \pi - B$, $\theta_3 = \pi - C$, where A, B, C denote the interior angles of
the triangle. Consequently, by substitution in (2.1),


(2.2)    $\displaystyle \int_\Gamma \frac{ds}{\rho} + \iint_{ABC} K\,d\omega = A + B + C - \pi = \mathcal{E}.$
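
The substitution leading to (2.2) is short enough to write out in full. The lines below are a sketch using only the exterior angles $\theta_i = \pi - (\text{interior angle})$ stated in the text.

```latex
\begin{align*}
\theta_1 + \theta_2 + \theta_3 &= 3\pi - (A + B + C), \\
\int_\Gamma \frac{ds}{\rho} + 3\pi - (A + B + C) + \iint_{ABC} K\,d\omega &= 2\pi, \\
\int_\Gamma \frac{ds}{\rho} + \iint_{ABC} K\,d\omega &= A + B + C - \pi .
\end{align*}
```

For a geodesic triangle the contour integral vanishes, and the last line reduces to the classical formula of Gauss quoted in §1.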


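By property (3),

(2.3)    $\displaystyle \mathcal{E} = k\mathcal{A} = k \iint_{ABC} d\omega;$


therefore


(2.4)    $\displaystyle \int_\Gamma \frac{ds}{\rho} = \iint_{ABC} (k - K)\,W\,du\,dv,$


since

(2.5)    $d\omega = W\,du\,dv,$


where $W = (EG - F^2)^{1/2}$.

Every polygon whose sides are curves of $\mathcal{F}$ can be decomposed into
triangles of $\mathcal{F}$. It follows, by the additive nature of both contour and surface
integration, that the formula (2.4) applies to any polygon of $\mathcal{F}$; that is, if $\Gamma$
denotes the boundary and R the interior of any polygon of $\mathcal{F}$, we have


(2.6)    $\displaystyle \int_\Gamma \frac{ds}{\rho} = \iint_R (k - K)\,W\,du\,dv.$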


(9) See W. Blaschke, Vorlesungen über Differentialgeometrie, 1921, p. 108.




By Green's theorem, the surface integral over R can be expressed as a
contour integral over $\Gamma$:


(2.7)    $\displaystyle \iint_R (k - K)\,W\,du\,dv = \int_\Gamma P_1\,du + Q_1\,dv,$


where $P_1, Q_1$ are any two functions of u, v which obey


(2.8)    $\dfrac{\partial Q_1}{\partial u} - \dfrac{\partial P_1}{\partial v} = (k - K)W;$

the existence of such functions is obvious.
Relations (2.6) and (2.7) give


(2.9)    $\displaystyle \int_\Gamma \left( \frac{ds}{\rho} - P_1\,du - Q_1\,dv \right) = 0$


for every polygon $\Gamma$ of $\mathcal{F}$. This implies that the same integral taken over
any polygonal path between any two fixed points of S does not depend on the
path itself but only on its end-points. (By a "polygonal path" we mean one
composed of a finite number of arcs of curves of $\mathcal{F}$.) According to a standard
theorem(10), it follows that the element of integration in (2.9) is an exact
differential:


$\dfrac{ds}{\rho} - P_1\,du - Q_1\,dv = \lambda_u\,du + \lambda_v\,dv,$


where the subscripts denote partial differentiation of the arbitrary function
$\lambda(u, v)$. Thus we have


(2.10)    $\dfrac{ds}{\rho} = P\,du + Q\,dv,$


where $P = P_1 + \lambda_u$, $Q = Q_1 + \lambda_v$. Obviously, P, Q also obey the condition (2.8):

(2.11)    $Q_u - P_v = (k - K)W,$


since $\lambda_{uv}$ cancels in the process of substitution.

Conversely, it is easily seen that if (2.10) is obeyed along every curve of a
family $\mathcal{F}$, and P, Q are related by (2.11), then $\mathcal{F}$ has property (3), as expressed
by (2.3).

Formula (2.10) by itself defines a type of curve family called a velocity
family(11). Thus property (3) is characteristic of a particular kind of velocity
family: one where formula (2.11) is obeyed.


(10) It is easily seen to be sufficient for the application of this theorem that the condition
of independence of the path of integration apply merely to polygonal paths of $\mathcal{F}$.

(11) The name is due to E. Kasner, these Transactions, vol. 10 (1909), p. 213. The geodesics
of a Weyl metric are a general velocity family; see H. Weyl, Raum, Zeit, Materie, 3d edition,



In minimal coordinates, where $ds^2 = 2F\,du\,dv$, the geodesic curvature $1/\rho$
of any curve v = v(u) is given by(12)

(2.12)    $\dfrac{ds}{\rho} = \dfrac{-v'' + (F_u/F)\,v' - (F_v/F)\,v'^2}{2iv'}\,du.$

Therefore, for a velocity family, we have by (2.10):

(2.13)    $v'' = Bv' + Cv'^2,$

where

(2.14)    $B = F_u/F - 2iP = (\log F)_u - 2iP, \qquad C = -F_v/F - 2iQ = -(\log F)_v - 2iQ.$


Conversely, a family $\mathcal{F}$ whose differential equation in minimal coordinates
is of the type (2.13), where B, C are any functions of u, v, obeys (2.10) with
P, Q defined by (2.14). That is: the form (2.13) is characteristic of velocity
families.


Let us now apply the condition (2.11) by calculating

(2.15)    $C_u - B_v = -2(\log F)_{uv} - 2i(Q_u - P_v).$


In minimal coordinates, where E = 0, G = 0, $W = (EG - F^2)^{1/2} = iF$, the
Gaussian curvature K is expressed by (1.17), so that the condition (2.11) becomes


(2.16)    $Q_u - P_v = ikF + i(\log F)_{uv}.$

This gives, when substituted in (2.15),

(2.17)    $C_u - B_v = 2kF.$


Conversely, (2.17) gives (2.16) when substituted in the identity (2.15).


In summary: property (3) is expressed in minimal coordinates by the
formulas (2.13), (2.17).
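
The passage from (2.14) to (2.17) can be written out step by step. The sketch below assumes the forms $B = (\log F)_u - 2iP$, $C = -(\log F)_v - 2iQ$ for (2.14) and $K = -(\log F)_{uv}/F$ for (1.17), with $W = iF$.

```latex
\begin{align*}
C_u - B_v &= -(\log F)_{vu} - 2iQ_u - (\log F)_{uv} + 2iP_v
           = -2(\log F)_{uv} - 2i\,(Q_u - P_v), \\
Q_u - P_v &= (k - K)\,W = ikF - iKF = ikF + i(\log F)_{uv}, \\
C_u - B_v &= -2(\log F)_{uv} - 2i\bigl(ikF + i(\log F)_{uv}\bigr)
           = -2(\log F)_{uv} + 2kF + 2(\log F)_{uv} = 2kF .
\end{align*}
```

The second-derivative terms in $\log F$ cancel exactly, so the condition on B, C involves the surface only through the single function F.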


3. Linearity of the family $\mathcal{F}$. We now have to impose the additional
property (2) of linearity on the family $\mathcal{F}$.


If we apply an arbitrary coordinate transformation


(3.1)    $u_1 = \phi(u, v), \qquad v_1 = \psi(u, v)$


to $\mathcal{F}$, the effect on the derivatives v', v'' is as follows:


(3.2)    $v_1' = \dfrac{\psi_u + \psi_v v'}{\phi_u + \phi_v v'};$


1919, p. 112. Cf. also C. H. Rowe [4]. The term “zyklisches Netz” used by J. Radon, following 
Blaschke, denotes the same thing as a velocity family; see J. Radon [3]. 
(12) Blaschke, loc. cit., p. 117. Write E = 0, G = 0, u' = 1, u'' = 0.




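(3.3)    $v_1'' = \dfrac{(\phi_u + \phi_v v')(\psi_{uu} + 2\psi_{uv}v' + \psi_{vv}v'^2 + \psi_v v'') - (\psi_u + \psi_v v')(\phi_{uu} + 2\phi_{uv}v' + \phi_{vv}v'^2 + \phi_v v'')}{(\phi_u + \phi_v v')^3}.$


Suppose that after this transformation the finite equation of $\mathcal{F}$ has the linear
form $v_1 = au_1 + b$, or the differential equation of $\mathcal{F}$ becomes $v_1'' = 0$. Then in
the original coordinate system (u, v) the differential equation of $\mathcal{F}$ is, by (3.3),


(3.4)    $v'' = A + Bv' + Cv'^2 + Dv'^3,$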


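where

(3.5)
$$\begin{aligned}
A &= \frac{\psi_u\phi_{uu} - \phi_u\psi_{uu}}{\Delta}, \\
B &= 2\,\frac{\psi_u\phi_{uv} - \phi_u\psi_{uv}}{\Delta} + \frac{\psi_v\phi_{uu} - \phi_v\psi_{uu}}{\Delta}, \\
C &= 2\,\frac{\psi_v\phi_{uv} - \phi_v\psi_{uv}}{\Delta} + \frac{\psi_u\phi_{vv} - \phi_u\psi_{vv}}{\Delta}, \\
D &= \frac{\psi_v\phi_{vv} - \phi_v\psi_{vv}}{\Delta}, \\
\Delta &= \phi_u\psi_v - \phi_v\psi_u \ne 0.
\end{aligned}$$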


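This is, consequently, the general form of the differential equation of a linear
family in any system of coordinates.

If now the coordinates are minimal, then the necessary and sufficient
condition for $\mathcal{F}$ to be a velocity family is, by comparison of (3.4) with (2.13),
A = 0, D = 0; that is,


(3.6)    $\psi_u\phi_{uu} - \phi_u\psi_{uu} = 0, \qquad \psi_v\phi_{vv} - \phi_v\psi_{vv} = 0.$

This gives

(3.7)    $\dfrac{\phi_{uu}}{\phi_u} = \dfrac{\psi_{uu}}{\psi_u} = \rho, \qquad \dfrac{\phi_{vv}}{\phi_v} = \dfrac{\psi_{vv}}{\psi_v} = \sigma,$

or

(3.8)    $\rho = (\log \phi_u)_u = (\log \psi_u)_u, \qquad \sigma = (\log \phi_v)_v = (\log \psi_v)_v.$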
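We calculate

$(\log \Delta)_u = \dfrac{\Delta_u}{\Delta} = \dfrac{(\phi_u\psi_{uv} - \psi_u\phi_{uv}) + (\phi_{uu}\psi_v - \psi_{uu}\phi_v)}{\Delta} = \dfrac{\phi_u\psi_{uv} - \psi_u\phi_{uv}}{\Delta} + \rho;$

similarly

$(\log \Delta)_v = \dfrac{\Delta_v}{\Delta} = \dfrac{\psi_v\phi_{uv} - \phi_v\psi_{uv}}{\Delta} + \sigma,$

so that the second and third equations of (3.5) can be written

(3.9)    $B = -2(\log \Delta)_u + 3\rho, \qquad C = 2(\log \Delta)_v - 3\sigma.$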


The partial differential equations (3.6) are easily integrated, and the
result may be written in the form


(3.10)    $U_1\phi + U_2\psi = 1, \qquad V_1\phi + V_2\psi = -1;$

therefore $\phi$, $\psi$ must have the forms

(3.11)    $\phi = \dfrac{U_2 + V_2}{U_1V_2 - U_2V_1}, \qquad \psi = -\dfrac{U_1 + V_1}{U_1V_2 - U_2V_1},$

where $U_1V_2 - U_2V_1 \ne 0$; that is, $U_1/U_2$ and $V_1/V_2$ are not equal to the same
constant, nor do we have $U_1 = 0$ and $U_2 = 0$ or $V_1 = 0$ and $V_2 = 0$.
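
That the expressions (3.11) satisfy the linear system (3.10) can be confirmed by direct substitution. The following lines are a sketch of that check, taking (3.11) in the form $\phi = (U_2 + V_2)/(U_1V_2 - U_2V_1)$, $\psi = -(U_1 + V_1)/(U_1V_2 - U_2V_1)$.

```latex
\begin{align*}
U_1\phi + U_2\psi
  &= \frac{U_1(U_2 + V_2) - U_2(U_1 + V_1)}{U_1V_2 - U_2V_1}
   = \frac{U_1V_2 - U_2V_1}{U_1V_2 - U_2V_1} = 1, \\
V_1\phi + V_2\psi
  &= \frac{V_1(U_2 + V_2) - V_2(U_1 + V_1)}{U_1V_2 - U_2V_1}
   = \frac{U_2V_1 - U_1V_2}{U_1V_2 - U_2V_1} = -1 .
\end{align*}
```

Equivalently, (3.11) is the solution of (3.10) by Cramer's rule, which is why the determinant $U_1V_2 - U_2V_1$ must not vanish.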
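From (3.11) we calculate

(3.12)
$$\begin{aligned}
\phi_u &= V_2\,\frac{\begin{vmatrix} U_1 + V_1 & U_1' \\ U_2 + V_2 & U_2' \end{vmatrix}}{\begin{vmatrix} U_1 & V_1 \\ U_2 & V_2 \end{vmatrix}^2}, &
\phi_v &= -U_2\,\frac{\begin{vmatrix} U_1 + V_1 & V_1' \\ U_2 + V_2 & V_2' \end{vmatrix}}{\begin{vmatrix} U_1 & V_1 \\ U_2 & V_2 \end{vmatrix}^2}, \\
\psi_u &= -V_1\,\frac{\begin{vmatrix} U_1 + V_1 & U_1' \\ U_2 + V_2 & U_2' \end{vmatrix}}{\begin{vmatrix} U_1 & V_1 \\ U_2 & V_2 \end{vmatrix}^2}, &
\psi_v &= U_1\,\frac{\begin{vmatrix} U_1 + V_1 & V_1' \\ U_2 + V_2 & V_2' \end{vmatrix}}{\begin{vmatrix} U_1 & V_1 \\ U_2 & V_2 \end{vmatrix}^2};
\end{aligned}$$

therefore, by the last equation of (3.5),

(3.13)    $\Delta = \dfrac{\begin{vmatrix} U_1 + V_1 & U_1' \\ U_2 + V_2 & U_2' \end{vmatrix} \cdot \begin{vmatrix} U_1 + V_1 & V_1' \\ U_2 + V_2 & V_2' \end{vmatrix}}{\begin{vmatrix} U_1 & V_1 \\ U_2 & V_2 \end{vmatrix}^3}.$

By substitution of (3.12) in (3.8),

(3.14)
$$\begin{aligned}
\rho &= \frac{\partial}{\partial u}\log \begin{vmatrix} U_1 + V_1 & U_1' \\ U_2 + V_2 & U_2' \end{vmatrix} - 2\,\frac{\partial}{\partial u}\log \begin{vmatrix} U_1 & V_1 \\ U_2 & V_2 \end{vmatrix}, \\
\sigma &= \frac{\partial}{\partial v}\log \begin{vmatrix} U_1 + V_1 & V_1' \\ U_2 + V_2 & V_2' \end{vmatrix} - 2\,\frac{\partial}{\partial v}\log \begin{vmatrix} U_1 & V_1 \\ U_2 & V_2 \end{vmatrix}.
\end{aligned}$$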


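Substituting (3.13), (3.14) in (3.9), we find

(3.15)
$$\begin{aligned}
B &= \frac{\partial}{\partial u}\log \begin{vmatrix} U_1 + V_1 & U_1' \\ U_2 + V_2 & U_2' \end{vmatrix} - 2\,\frac{\partial}{\partial u}\log \begin{vmatrix} U_1 + V_1 & V_1' \\ U_2 + V_2 & V_2' \end{vmatrix}, \\
C &= 2\,\frac{\partial}{\partial v}\log \begin{vmatrix} U_1 + V_1 & U_1' \\ U_2 + V_2 & U_2' \end{vmatrix} - \frac{\partial}{\partial v}\log \begin{vmatrix} U_1 + V_1 & V_1' \\ U_2 + V_2 & V_2' \end{vmatrix};
\end{aligned}$$

that is, as abbreviated by the notation (1.5),

(3.16)    $B = \dfrac{\partial}{\partial u}(\log \mathrm{I} - 2\log \mathrm{II}), \qquad C = \dfrac{\partial}{\partial v}(2\log \mathrm{I} - \log \mathrm{II}).$


We thus have the result:

In order that a velocity family expressed in minimal coordinates be linear,
it is necessary and sufficient that the coefficients B, C in (2.13) have the special
form (3.15) or (3.16).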
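
The substitution of (3.13) and (3.14) in (3.9) can be followed in logarithmic form. In the sketch below, $\Theta$ is a shorthand introduced here for the determinant $U_1V_2 - U_2V_1$, and I, II are the determinants (1.5); the forms $\Delta = \mathrm{I}\cdot\mathrm{II}/\Theta^3$, $\rho = (\log \mathrm{I} - 2\log\Theta)_u$, $\sigma = (\log \mathrm{II} - 2\log\Theta)_v$ are assumed for (3.13) and (3.14).

```latex
\begin{align*}
\log \Delta &= \log \mathrm{I} + \log \mathrm{II} - 3\log \Theta, \\
B = -2(\log \Delta)_u + 3\rho
  &= -2(\log \mathrm{I} + \log \mathrm{II} - 3\log\Theta)_u
     + 3(\log \mathrm{I} - 2\log\Theta)_u
   = (\log \mathrm{I} - 2\log \mathrm{II})_u, \\
C = 2(\log \Delta)_v - 3\sigma
  &= 2(\log \mathrm{I} + \log \mathrm{II} - 3\log\Theta)_v
     - 3(\log \mathrm{II} - 2\log\Theta)_v
   = (2\log \mathrm{I} - \log \mathrm{II})_v .
\end{align*}
```

The terms in $\log\Theta$ cancel in both lines, which explains why the determinant $U_1V_2 - U_2V_1$ is absent from the final formulas (3.16).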

To complete the imposition of property (3), in addition to the property
(2) of linearity, we must particularize our velocity family by the additional
condition (2.17): $C_u - B_v = 2kF$. Substituting from (3.16), this gives the result
stated in formula (1.7):

(3.17)    $2kF = \dfrac{\partial^2}{\partial u\,\partial v}(\log \mathrm{I} + \log \mathrm{II}).$


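We find by direct calculation:

(3.18)    $\dfrac{\partial^2}{\partial u\,\partial v}\log \mathrm{I} = \dfrac{\begin{vmatrix} U_1 + V_1 & U_1' \\ U_2 + V_2 & U_2' \end{vmatrix}\begin{vmatrix} V_1' & U_1'' \\ V_2' & U_2'' \end{vmatrix} - \begin{vmatrix} U_1 + V_1 & U_1'' \\ U_2 + V_2 & U_2'' \end{vmatrix}\begin{vmatrix} V_1' & U_1' \\ V_2' & U_2' \end{vmatrix}}{\begin{vmatrix} U_1 + V_1 & U_1' \\ U_2 + V_2 & U_2' \end{vmatrix}^2}.$

The determinants which appear in the numerators are among the six in the
matrix

$\begin{Vmatrix} U_1 + V_1 & U_1' & V_1' & U_1'' \\ U_2 + V_2 & U_2' & V_2' & U_2'' \end{Vmatrix},$

which obey the well known identity(13):

(3.19)    $\begin{vmatrix} U_1 + V_1 & U_1' \\ U_2 + V_2 & U_2' \end{vmatrix}\begin{vmatrix} V_1' & U_1'' \\ V_2' & U_2'' \end{vmatrix} - \begin{vmatrix} U_1 + V_1 & U_1'' \\ U_2 + V_2 & U_2'' \end{vmatrix}\begin{vmatrix} V_1' & U_1' \\ V_2' & U_2' \end{vmatrix} = \begin{vmatrix} U_1 + V_1 & V_1' \\ U_2 + V_2 & V_2' \end{vmatrix}\begin{vmatrix} U_1' & U_1'' \\ U_2' & U_2'' \end{vmatrix}.$

Therefore

(3.20)    $\dfrac{\partial^2}{\partial u\,\partial v}\log \mathrm{I} = \dfrac{\mathrm{II}}{\mathrm{I}^2}\begin{vmatrix} U_1' & U_1'' \\ U_2' & U_2'' \end{vmatrix}.$

Similarly we find

(3.21)    $\dfrac{\partial^2}{\partial u\,\partial v}\log \mathrm{II} = \dfrac{\mathrm{I}}{\mathrm{II}^2}\begin{vmatrix} V_1' & V_1'' \\ V_2' & V_2'' \end{vmatrix}.$


(13) The same as the one which governs Plücker line coordinates, being obeyed by the six
determinants of second order in any two-by-four matrix.
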

These are the formulas stated as (1.8); substituted in (3.17), they give the 
expanded form (1.7’) for 2kF. 

The proof of our main results is thus completed. 
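
The direct calculation behind (3.20) can also be carried out without the determinant identity, by expanding I from (1.5). The lines below are a sketch of that expansion; they assume $\mathrm{I} = (U_1 + V_1)U_2' - (U_2 + V_2)U_1'$ and $\mathrm{II} = (U_1 + V_1)V_2' - (U_2 + V_2)V_1'$.

```latex
\begin{align*}
\mathrm{I}_u &= (U_1 + V_1)U_2'' - (U_2 + V_2)U_1'', \qquad
\mathrm{I}_v = V_1'U_2' - V_2'U_1', \qquad
\mathrm{I}_{uv} = V_1'U_2'' - V_2'U_1'', \\
\mathrm{I}\,\mathrm{I}_{uv} - \mathrm{I}_u\mathrm{I}_v
  &= \bigl[(U_1 + V_1)V_2' - (U_2 + V_2)V_1'\bigr]\bigl(U_1'U_2'' - U_2'U_1''\bigr)
   = \mathrm{II}\,\begin{vmatrix} U_1' & U_1'' \\ U_2' & U_2'' \end{vmatrix}, \\
\frac{\partial^2}{\partial u\,\partial v}\log \mathrm{I}
  &= \frac{\mathrm{I}\,\mathrm{I}_{uv} - \mathrm{I}_u\mathrm{I}_v}{\mathrm{I}^2}
   = \frac{\mathrm{II}}{\mathrm{I}^2}\begin{vmatrix} U_1' & U_1'' \\ U_2' & U_2'' \end{vmatrix}.
\end{align*}
```

The factorization in the second line is exactly what the two-by-four determinant identity supplies; interchanging the roles of the U's and V's gives (3.21) in the same way.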

4. General coordinates. It is interesting to see how our formulas look in 
general coordinates u, v, instead of minimal coordinates. 

Using the general formula(14) for geodesic curvature $1/\rho$, the condition
(2.10) for a velocity family gives the following characteristic differential
equation for such a family:

(4.1)
$$\begin{aligned}
W^2v'' ={}& W(P + Qv')(E + 2Fv' + Gv'^2) \\
&+ (F + Gv')\left[\tfrac{1}{2}E_u + E_vv' + (F_v - \tfrac{1}{2}G_u)v'^2\right] \\
&- (E + Fv')\left[(F_u - \tfrac{1}{2}E_v) + G_uv' + \tfrac{1}{2}G_vv'^2\right].
\end{aligned}$$

The imposition of property (3) is completed by (2.11), where K is to be
thought of as expressed in terms of E, F, G and their first and second partial
derivatives(15):


(4.2)    $Q_u - P_v = (k - K)W.$

We have also to express property (2), of linearity, in general coordinates.
It is a known result(16) that the most general linear family has the form

(4.3)    $v'' = A + Bv' + Cv'^2 + Dv'^3,$

where A, B, C, D are functions of u, v which obey the conditions
(AC — Ay)» — (AD + 3Cu — 


(4.4) 
+ B(AD + 3C, — 3B,) — A(BD + D,) = 0, 
(14) Blaschke, loc. cit., p. 117.
(15) See formula (4.11).
(16) Due to Lie and R. Liouville. See E. Kasner [1].




(4.5) (BD + Du)u + (AD + — 
+ C(AD + 9C. — 3B.) — D(AC — A,) = 0. 
The formula (4.1) is of the type (4.3) with

(4.6)
$$\begin{aligned}
A &= \{WPE + \tfrac{1}{2}FE_u - E(F_u - \tfrac{1}{2}E_v)\} / W^2, \\
B &= \{2WPF + WQE + FE_v + \tfrac{1}{2}GE_u - F(F_u - \tfrac{1}{2}E_v) - EG_u\} / W^2, \\
C &= \{WPG + 2WQF + GE_v + F(F_v - \tfrac{1}{2}G_u) - \tfrac{1}{2}EG_v - FG_u\} / W^2, \\
D &= \{WQG + G(F_v - \tfrac{1}{2}G_u) - \tfrac{1}{2}FG_v\} / W^2.
\end{aligned}$$

The conditions (4.4), (4.5) of linearity give two partial differential equations
involving E, F, G, P, Q.


In summary: configurations $(\mathcal{F}, S)$ having properties (2) and (3) are
characterized in general coordinates by


$S$: $ds^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2,$
$\mathcal{F}$: of type (4.1),


where the five functions E, F, G, P, Q obey the three partial differential equations
(4.2), (4.4), (4.5), in which A, B, C, D are defined by (4.6). These equations are
of the third order in E, F, G and the second order in P, Q.

Of course, the general solution is obtainable by applying an arbitrary
transformation $u_1 = \phi(u, v)$, $v_1 = \psi(u, v)$ to the formulas based on minimal
coordinates.

We may inquire also as to the form our equations have in the particular
coordinate system (u, v) where the equations of $\mathcal{F}$ are linear(17): v = au + b, or
v'' = 0. In (4.6), we have then to write


(4.7)    $A = 0, \quad B = 0, \quad C = 0, \quad D = 0,$


which, in addition to (4.2), give a system of five conditions on E, F, G, P, Q.
From these we easily eliminate P, Q and obtain the following three conditions
on E, F, G:


Ey 
E 


(4.8) — F?) + + FF, BF, 486, + = 0, 


.9) 


Ey 
+ 3FG, + FF, — GF, — GE, + HG = 0, 


E E, 
—|F Fy, F, | = 4kW*. 
G G, G, 


0 FE, — EF, 0 FG, —GF, 
(4.10) ——— + 2W*— ———_- 
Ov WE Ou WG 


(17) Of course, this coordinate system is determined only up to an arbitrary projective
transformation of u, v.




In the derivation of (4.10), we use the formula of G. Frobenius for Gaussian 
curvature 


E, —Fy F, —G, 
(4.11) 


W 


In summary, (4.8)-(4.10) are necessary and sufficient conditions on the
E, F, G of a surface in order that the curve family defined by the general
linear equation, v = au + b, have the property $\mathcal{E} = k\mathcal{A}$.

(4.8) and (4.9) are necessary and sufficient in order that v = au + b shall be
a velocity family.
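
As a consistency check between §4 and §2, the general coefficients (4.6) can be specialized to minimal coordinates. The sketch below assumes (4.6) in the form with denominators $W^2$ and half-coefficients $\tfrac12$, and sets E = 0, G = 0, so that $W^2 = -F^2$ and $W = iF$.

```latex
\begin{align*}
A &= 0, \qquad D = 0, \\
B &= \frac{2WPF - FF_u}{W^2} = \frac{2iF^2P - FF_u}{-F^2} = \frac{F_u}{F} - 2iP, \\
C &= \frac{2WQF + FF_v}{W^2} = \frac{2iF^2Q + FF_v}{-F^2} = -\frac{F_v}{F} - 2iQ .
\end{align*}
```

These are the coefficients (2.14), and the vanishing of A and D reproduces the form (2.13) characteristic of velocity families in minimal coordinates.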

5. Geometric construction for the property $\mathcal{E} = k\mathcal{A}$. In the case k = 0, it
is well known that the curve families $\mathcal{F}$ which have the property: $\mathcal{E} = 0$ or
$A + B + C = \pi$ for every triangle ABC of $\mathcal{F}$, are exactly the isogonal families(19).
These consist of the totality of $\infty^2$ trajectories under every possible constant
angle $\theta$ of any given family $\alpha$ of $\infty^1$ curves.

It is easy to generalize this construction to the case $k \ne 0$. Take any net
of curves upon an arbitrary surface S, composed of two families $\alpha$, $\beta$ of $\infty^1$
curves. Construct a trajectory T of the family $\alpha$, not under constant angle,
but rather so that the angle $\theta$ between T and $\alpha$ decreases by k times the
element of area swept out by the arc of the curve $\beta_p$ of $\beta$ which passes through
the point p describing T and extends to the intersection m of $\beta_p$ with any
chosen fixed curve $\alpha_0$ of $\alpha$.

That is (with reference to a figure easily drawn by the reader) we have,
in integrated form, the law


(5.1)    $\theta_2 - \theta_1 = -k \cdot \text{area } p_1p_2m_2m_1$


for the construction of T. Evidently, this law determines the formation of T,
element by element, when any initial point and direction are given;
therefore the totality of trajectories T is a family $\mathcal{F}$ of $\infty^2$ curves. It is very easy to
give by means of a figure a proof of the property: $\mathcal{E} = k\mathcal{A}$ for every triangle
of $\mathcal{F}$.

It remains to be shown that every family $\mathcal{F}$ with the property $\mathcal{E} = k\mathcal{A}$ is
obtainable by a construction of the type just described. This is readily done
by the following converse reasoning. Construct the pencil $\Pi$ of curves of $\mathcal{F}$
through a fixed point p. In the region R covered by $\Pi$, construct any family $\beta$
of $\infty^1$ curves all of which intersect a fixed base curve $\alpha_0$. At each point of R
construct a direction $\delta$ according to the following law: (i) at the points of $\alpha_0$,
$\delta$ shall coincide with the tangential direction to $\alpha_0$; (ii) the angle $\theta$ between $\Pi$
and $\delta$ shall vary according to the law (5.1). We thus have a field of directions

(18) Blaschke, loc. cit., p. 79.


(19) G. Scheffers, Isogonalkurven, Äquitangentialkurven und komplexe Zahlen, Mathematische
Annalen, vol. 60 (1905), p. 504.



$\delta$, whose integration gives a family of $\infty^1$ curves $\alpha$, including $\alpha_0$. If now T is
any curve of the family $\mathcal{F}$ lying in the region R, it is evident, by applying the
property $\mathcal{E} = k\mathcal{A}$ to the triangle formed by any arc $p_1p_2$ of T and the curves
$pp_1$, $pp_2$ of $\Pi$, that T traverses $\alpha$ according to the law (5.1).

Thus the law (5.1) holds as long as T lies in the region R covered by $\Pi$.
We can extend this region by applying the same reasoning to the pencil of
curves of $\mathcal{F}$ which pass through any other fixed point p' of R, and repeating
this procedure any finite number of times.

6. Higher dimensions, n > 2. We conclude with a statement of the
analogous problem for higher dimensional spaces, which we hope to consider in a
future paper.

Let $\mathcal{F}$ denote a linear family of curves in a space of n > 2 dimensions; that is,
let $\mathcal{F}$ be depictable as the totality of straight lines of a flat projective n-space P.
It is required to impose on the space P a Riemannian metric R: $ds^2 = g_{ij}\,dx^i\,dx^j$,
so that, in every triangle of P, $\mathcal{E} = k\mathcal{A}$ with k a preassigned nonzero constant,
angles and areas being measured according to R.

Certainly, a sufficient condition is that the space R have constant
Riemannian curvature and that $\mathcal{F}$ consist of its geodesics. In other words, R shall be
the Cayley metric based on any fixed quadric Q: dist $ab = \{2(-k)^{1/2}\}^{-1} \cdot \log\,(abp_1p_2)$,
where $p_1$, $p_2$ are the intersection points of the line ab with Q, and
the parenthesis denotes an anharmonic ratio. For the Cayley metric is a
typical one of constant curvature, and the straight lines are its geodesics.

It remains to be seen whether, for n > 2, the Cayley metric is the most
general one which can be imposed on the projective space P so that $\mathcal{E} = k\mathcal{A}$. We
reserve this problem for future consideration.

It may be remarked that for k = 0 and n > 2 the property $\mathcal{E} = 0$ (that is,
the property that the angle-sum of every triangle of $\mathcal{F}$ is two right angles)
implies that $\mathcal{F}$ is linear(”). The author has proved that, furthermore, it must
be possible to represent the Riemann space R conformally on a euclidean
space E so that $\mathcal{F}$ corresponds to the straight lines of E(20).


REFERENCES 

1. E. Kasner, A characteristic property of isothermal systems of curves, Mathematische 
Annalen, vol. 59 (1904), pp. 352-354. 

2. J. Douglas, A criterion for the conformal equivalence of a Riemann space to a euclidean 
space, these Transactions, vol. 27 (1925), pp. 299-306. 

3. J. Radon, Kurvennetze auf Flächen und im Raume von Riemann, Abhandlungen aus dem
mathematischen Seminar der Hamburgischen Universität, vol. 5 (1927), pp. 45-53.

4. C. H. Rowe, On certain systems of curves in Riemannian space, Journal de Mathé- 
matiques Pures et Appliquées, vol. 12 (1933), pp. 283-308. 

BROOKLYN, N. Y.


(20) Reference [2].


ON STRONG SUMMABILITY OF FOURIER SERIES 


BY 
OTTO SZASZ 


1. A series A,, or the corresponding sequence of partial sums s, =)_3A,, 
n=(0, 1, 2,---, is said to be strongly summable (C, 1) with index k to the 
sum s if k>0 and 


1 n 
1 li y—s|*=0(?). 


It follows from Hölder's inequality that the larger $k$ the stronger is the assertion
(1.1). Furthermore, for $k = 1$, (1.1) evidently implies $(C, 1)$ summability
to the sum $s$.
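The difference between $(C, 1)$ summability and strong summability can be seen on a toy example. The following is only a sketch under stated assumptions: the two sequences of partial sums are illustrative choices (not taken from the paper), and the helper names `cesaro_mean` and `strong_mean` are hypothetical.

```python
def cesaro_mean(seq):
    """(C, 1) mean of the partial sums in `seq`."""
    return sum(seq) / len(seq)


def strong_mean(seq, s, k):
    """Average of |s_nu - s|^k; (1.1) asks that this tends to 0."""
    return sum(abs(x - s) ** k for x in seq) / len(seq)


N = 10_000
# Assumed toy partial sums s_n = (-1)^n: they oscillate and never settle.
oscillating = [(-1) ** n for n in range(N)]
# Assumed toy partial sums s_n = 1/(n+1): they converge to 0.
decaying = [1.0 / (n + 1) for n in range(N)]

print(cesaro_mean(oscillating))        # the arithmetic means cancel out
print(strong_mean(oscillating, 0, 2))  # the strong mean does not tend to 0
print(strong_mean(decaying, 0, 2))     # tends to 0 as N grows
```

The oscillating sequence is $(C, 1)$ summable to 0 but not strongly summable, since every term contributes $|s_\nu - 0|^2 = 1$ to the average.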


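Suppose now that $f(t)$ is a periodic function of the class $L$. Let its Fourier
series be


$$f(t) \sim \tfrac{1}{2}a_0 + \sum_{\nu=1}^{\infty} (a_\nu \cos \nu t + b_\nu \sin \nu t) = \sum_{\nu=0}^{\infty} A_\nu(t); \tag{1.2}$$

let

$$\phi(x, t) = \tfrac{1}{2}\{f(x + t) + f(x - t) - 2s\}. \tag{1.3}$$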
Hardy and Littlewood proved (1913): 


THEOREM I. The Fourier series of an integrable function f(t) is strongly 
summable (C, 1) with index 2 at a point x if f(t) is of integrable square in a 
neighborhood of x and if for some s 


$$\int_0^t \{\phi(x, u)\}^2\,du = o(t) \quad \text{as } t \downarrow 0. \tag{1.4}$$


In this paper we shall restrict ourselves to the index k =2, and speak sim- 
ply of this case as “strong summability.” For generalizations of Theorem I 
and for further references consult Hardy and Littlewood [2] and Zygmund 
[5, chap. 10]. 

For the special case in which $\phi(t) \to 0$ as $t \downarrow 0$, Fejér [1] recently gave two
new proofs of the strong summability of the series (1.2) at $t = x$. We shall
simplify his device and use it to give two new proofs of Theorem I. The essence
of Fejér's method is to introduce double integrals with positive kernels
while using the $(C, 3)$ and Abel summability methods. Replacing the partial
sums $s_n$ by $s_n - \tfrac{1}{2}A_n$ we get simpler (and also positive) kernels.

Presented to the Society, December 26, 1939; received by the editors February 17, 1940. 


(1) For generalizations to other summability methods cf. [3, §§7, 8 and 11]. Numbers in
brackets refer to the bibliography at the end of the paper. 




In the last section we prove still another theorem of Hardy and Little- 
wood [2]: 


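THEOREM II. If

$$\int_0^t |\phi(x, u)|\,du = o(t) \quad \text{as } t \downarrow 0, \tag{1.5}$$

then

$$\sum_{\nu=1}^{n} (s_\nu - s)^2 = o(n \log n) \quad \text{as } n \to \infty. \tag{1.6}$$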


Our proof is shorter and simpler, not involving complex function theory. 
Hardy and Littlewood also proved, by constructing examples, that (1.6) is 
the sharpest asymptotic estimate implied by the assumption (1.5). 

2. We prove first the following lemma. 


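LEMMA. Let $s_0^* = 0$, $s_n^* = s_n - \tfrac{1}{2}A_n$, $n = 1, 2, \dots$. If $\lim_{n \to \infty} A_n = 0$, and if one
of the sequences $s_n$, $s_n^*$ is strongly summable, so is the other.


This follows from the identities


$$\sum_{\nu=1}^{n} (s_\nu^* - s)^2 = \sum_{\nu=1}^{n} (s_\nu - s)^2 - \sum_{\nu=1}^{n} A_\nu (s_\nu - s) + \frac{1}{4}\sum_{\nu=1}^{n} A_\nu^2,$$

$$\sum_{\nu=1}^{n} (s_\nu - s)^2 = \sum_{\nu=1}^{n} (s_\nu^* - s)^2 + \sum_{\nu=1}^{n} A_\nu (s_\nu^* - s) + \frac{1}{4}\sum_{\nu=1}^{n} A_\nu^2.$$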


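In view of this lemma, we may deal with $s_n^*(x)$ instead of $s_n(x)$ while discussing
the series (1.2). Now


$$s_n^*(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x + t) \cot \tfrac{1}{2}t \, \sin nt \, dt = \frac{1}{\pi}\int_0^{\pi} \psi(x, t) \cot \tfrac{1}{2}t \, \sin nt \, dt, \tag{2.1}$$


where $\psi(x, t) = \tfrac{1}{2}\{f(x + t) + f(x - t)\} = \psi(t)$. Hence


$$s_n^*(x)^2 = \frac{1}{\pi^2}\int_0^{\pi}\!\!\int_0^{\pi} \psi(t)\psi(u) \cot \tfrac{1}{2}t \cot \tfrac{1}{2}u \, \sin nt \, \sin nu \, dt\,du,$$


and

$$\sum_{\nu=1}^{n} (n + 1 - \nu)^2 s_\nu^*(x)^2 = \frac{1}{\pi^2}\int_0^{\pi}\!\!\int_0^{\pi} \psi(t)\psi(u) \cot \tfrac{1}{2}t \cot \tfrac{1}{2}u \, R_n(t, u)\,dt\,du, \tag{2.2}$$


where $R_n(t, u) = \sum_{\nu=1}^{n} (n + 1 - \nu)^2 \sin \nu t \sin \nu u$.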




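If $f(t) \equiv 1$, then $A_0(t) = 1$, $A_\nu(t) = 0$, $\nu = 1, 2, \dots$, and $s_\nu^*(x) = 1$, $\nu = 1, 2, \dots$; hence
from (2.2)


$$\sum_{\nu=1}^{n} (n + 1 - \nu)^2 = \frac{1}{\pi^2}\int_0^{\pi}\!\!\int_0^{\pi} \cot \tfrac{1}{2}t \cot \tfrac{1}{2}u \, R_n(t, u)\,dt\,du = \tfrac{1}{6}n(n + 1)(2n + 1). \tag{2.3}$$

Now

$$R_n(t, u) > 0 \quad \text{for } 0 < t < \pi,\ 0 < u < \pi; \tag{2.4}$$


the proof is elementary (cf. [4, §2]).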
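Relations (2.3) and (2.4) lend themselves to a quick numerical sanity check. The sketch below is illustrative only: it uses a midpoint rule (an assumption of this example, not the paper's method), exploits the fact that $R_n$ is a finite sum of products so that the double integral splits into squares of one-dimensional integrals, and samples $R_n$ on a coarse interior grid. The function names are hypothetical.

```python
import math


def R(n, t, u):
    """Kernel R_n(t, u) = sum_{v=1}^{n} (n+1-v)^2 sin(vt) sin(vu)."""
    return sum((n + 1 - v) ** 2 * math.sin(v * t) * math.sin(v * u)
               for v in range(1, n + 1))


def cot_sine_integral(v, steps=2000):
    """Midpoint-rule value of the integral of cot(t/2) sin(vt) over (0, pi)."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.sin(v * t) / math.tan(t / 2)
    return total * h


def lhs_2_3(n):
    """(1/pi^2) times the double integral of cot(t/2) cot(u/2) R_n(t, u).

    Because R_n is a finite sum of products sin(vt) sin(vu), the double
    integral equals sum_v (n+1-v)^2 * (one-dimensional integral)^2.
    """
    return sum((n + 1 - v) ** 2 * cot_sine_integral(v) ** 2
               for v in range(1, n + 1)) / math.pi ** 2


def min_R_on_grid(n, steps=40):
    """Smallest sampled value of R_n on an interior midpoint grid of (0, pi)^2."""
    h = math.pi / steps
    pts = [(i + 0.5) * h for i in range(steps)]
    return min(R(n, t, u) for t in pts for u in pts)


for n in (1, 2, 3, 4):
    print(n, lhs_2_3(n), n * (n + 1) * (2 * n + 1) / 6, min_R_on_grid(n) > 0)
```

A grid sample cannot prove (2.4); it only fails to find a counterexample for the small values of $n$ tried here.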
As a first application of (2.2), (2.3) and (2.4) we get: 


If $|f(t)| \le 1$ in $|t| \le \pi$, then $\sum_{\nu=1}^{n} (n + 1 - \nu)^2 s_\nu^*(x)^2 \le \tfrac{1}{6}n(n + 1)(2n + 1)$.
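Also from (2.1)


$$1 = \frac{1}{\pi}\int_0^{\pi} \cot \tfrac{1}{2}t \, \sin nt \, dt,$$


hence

$$s_n^*(x) - s = \frac{1}{\pi}\int_0^{\pi} \phi(x, t) \cot \tfrac{1}{2}t \, \sin nt \, dt, \tag{2.5}$$

and, writing $\phi(t)$ for $\phi(x, t)$,


$$\sum_{\nu=1}^{n} (n + 1 - \nu)^2 (s_\nu^* - s)^2 = \frac{1}{\pi^2}\int_0^{\pi}\!\!\int_0^{\pi} \phi(t)\phi(u) \cot \tfrac{1}{2}t \cot \tfrac{1}{2}u \, R_n(t, u)\,dt\,du = I_n(\phi).$$

Now (2.4) yields

$$|I_n(\phi)| \le I_n(\Phi) \tag{2.6}$$


whenever $|\phi(t)| \le \Phi(t)$ in $0 < t < \pi$.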

The proof of strong summability at a point where $\phi(x, t) \to 0$ as $t \downarrow 0$ now
follows as in Fejér's method. We note first that $I_n(\phi) = o(n^3)$ as $n \to \infty$ is a
necessary and sufficient condition for the strong summability of the series
(1.2) at $t = x$, or, what is the same, of the cosine series of $\phi(t)$ at $t = 0$. This
follows from the following general inequalities for an arbitrary sequence of
nonnegative quantities $p_\nu \ge 0$:


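$$\sum_{\nu=1}^{n} (n + 1 - \nu)^2 p_\nu \le n^2 \sum_{\nu=1}^{n} p_\nu \le \sum_{\nu=1}^{2n} (2n - \nu)^2 p_\nu. \tag{2.7}$$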


Next, (2.6) yields the following theorem: 




Whenever $|\phi(t)| \le \Phi(t)$ in $0 < t < \pi$, the strong summability of the cosine series
of $\Phi(t)$ at $t = 0$ implies that of the series of $\phi(t)$.


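If now $\phi(t) \to 0$ as $t \downarrow 0$, then there is an interval $0 \le t \le \delta$ in which $\phi(t)$
is bounded. We choose the majorant function $\Phi$ to be


$$\Phi(t) = \begin{cases} \max_{0 \le \tau \le t} |\phi(\tau)| & \text{if } 0 \le t < \delta, \\ |\phi(t)| & \text{if } \delta < t < \pi. \end{cases}$$

Now $\Phi$ is continuous at $t = 0$ and monotonic in $0 < t < \delta$; hence its cosine series
converges at $t = 0$, and it is, consequently, strongly summable. Thus the series
(1.2) is strongly summable at $t = x$.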
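3. The symmetry of the integrand gives


$$I_n(\phi) = \frac{1}{\pi^2}\iint_{0 \le t \le u \le \pi} + \frac{1}{\pi^2}\iint_{0 \le u \le t \le \pi} = \frac{2}{\pi^2}\iint_{0 \le t \le u \le \pi}.$$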


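Furthermore, for the function


$$\phi_1(t) = \begin{cases} 0 & \text{for } 0 < t < \delta, \\ \phi(t) & \text{for } \delta < t < \pi, \end{cases} \tag{3.1}$$


we have $I_n(\phi_1) = o(n^3)$, since the cosine series of $\phi_1(t)$ converges to 0 at $t = 0$. This yields the following
result: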


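A necessary and sufficient condition that (1.2) be strongly summable at $t = x$
is that for a fixed $\delta > 0$


$$I_n^{(\delta)}(\phi) = \frac{2}{\pi^2}\iint_{0 \le t \le u \le \delta} \phi(t)\phi(u) \cot \tfrac{1}{2}t \cot \tfrac{1}{2}u \, R_n(t, u)\,dt\,du = o(n^3)$$


as $n \to \infty$.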


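We now use this criterion to prove Theorem I.
Schwarz's inequality yields


$$\tfrac{1}{4}\, I_n^{(\delta)}(\phi)^2 \le \frac{1}{\pi^4}\iint_{0 \le t \le u \le \delta} \phi(t)^2 \cot \tfrac{1}{2}t \cot \tfrac{1}{2}u \, R_n(t, u)\,dt\,du \cdot \iint_{0 \le t \le u \le \delta} \phi(u)^2 \cot \tfrac{1}{2}t \cot \tfrac{1}{2}u \, R_n(t, u)\,dt\,du.$$


Hence


$$|I_n^{(\delta)}(\phi)| \le \frac{1}{\pi^2}\int_0^{\delta}\!\!\int_0^{\delta} \phi(t)^2 \cot \tfrac{1}{2}t \cot \tfrac{1}{2}u \, R_n(t, u)\,dt\,du \le \frac{1}{\pi^2}\int_0^{\delta}\!\!\int_0^{\pi} \phi(t)^2 \cot \tfrac{1}{2}t \cot \tfrac{1}{2}u \, R_n(t, u)\,du\,dt.$$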


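But

$$\int_0^{\pi} \cot \tfrac{1}{2}u \, R_n(t, u)\,du = \sum_{\nu=1}^{n} (n + 1 - \nu)^2 \sin \nu t \int_0^{\pi} \cot \tfrac{1}{2}u \, \sin \nu u \, du = \pi \sum_{\nu=1}^{n} (n + 1 - \nu)^2 \sin \nu t;$$


and the relation

$$\int_0^{\delta} \phi(t)^2 \cot \tfrac{1}{2}t \sum_{\nu=1}^{n} (n + 1 - \nu)^2 \sin \nu t \, dt = o(n^3)$$


follows from Lebesgue's theorem on $(C, 1)$ summability applied to $\phi(t)^2$, using
(1.4) and (2.7). This proves Theorem I.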


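4. We shall now apply the Abel-Poisson summability method. From (2.1),
for $0 < r < 1$,


$$\sum_{n=1}^{\infty} s_n^*(x)^2 r^n = \frac{1}{\pi^2}\int_0^{\pi}\!\!\int_0^{\pi} \psi(t)\psi(u) \cot \tfrac{1}{2}t \cot \tfrac{1}{2}u \Bigl(\sum_{n=1}^{\infty} \sin nt \, \sin nu \, r^n\Bigr) dt\,du$$

$$= \frac{4r(1 - r^2)}{\pi^2}\int_0^{\pi}\!\!\int_0^{\pi} \psi(t)\psi(u) \cos^2 \tfrac{1}{2}t \cos^2 \tfrac{1}{2}u \, [1 - 2r \cos (u - t) + r^2]^{-1}[1 - 2r \cos (u + t) + r^2]^{-1}\,dt\,du.$$

Putting $f(t) \equiv 1$, we get

$$\frac{r}{1 - r} = \frac{4r(1 - r^2)}{\pi^2}\int_0^{\pi}\!\!\int_0^{\pi} \cos^2 \tfrac{1}{2}t \cos^2 \tfrac{1}{2}u \, [1 - 2r \cos (u - t) + r^2]^{-1}[1 - 2r \cos (u + t) + r^2]^{-1}\,dt\,du.$$


The integrand will be denoted by $P(t, u; r)$. Evidently

$$P(t, u; r) > 0 \quad \text{for } 0 < t < \pi,\ 0 < u < \pi,\ 0 < r < 1. \tag{4.1}$$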
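The passage from the sine series to the closed kernel rests on the identity $\cot \tfrac{1}{2}t \cot \tfrac{1}{2}u \sum_{n \ge 1} \sin nt \sin nu \, r^n = 4r(1 - r^2) P(t, u; r)$. The following sketch compares the two sides numerically at a few sample points. The sample points and the truncation length are arbitrary choices of this example, and the function names are hypothetical.

```python
import math


def sine_series(t, u, r, terms=2000):
    """Partial sum of sum_{n>=1} sin(nt) sin(nu) r^n (tail is O(r^terms))."""
    return sum(math.sin(n * t) * math.sin(n * u) * r ** n
               for n in range(1, terms + 1))


def brackets(t, u, r):
    """The two factors 1 - 2r cos(u - t) + r^2 and 1 - 2r cos(u + t) + r^2."""
    return (1 - 2 * r * math.cos(u - t) + r * r,
            1 - 2 * r * math.cos(u + t) + r * r)


def P(t, u, r):
    """Kernel P(t, u; r) = cos^2(t/2) cos^2(u/2) / (product of the brackets)."""
    d_minus, d_plus = brackets(t, u, r)
    return math.cos(t / 2) ** 2 * math.cos(u / 2) ** 2 / (d_minus * d_plus)


def lhs(t, u, r):
    """cot(t/2) cot(u/2) times the truncated sine series."""
    return sine_series(t, u, r) / (math.tan(t / 2) * math.tan(u / 2))


samples = [(0.3, 1.1, 0.5), (2.0, 0.7, 0.9), (1.5, 1.5, 0.95)]
for t, u, r in samples:
    print(lhs(t, u, r), 4 * r * (1 - r * r) * P(t, u, r))
```

Since both brackets equal $(1 - r)^2 + 4r \sin^2 \tfrac{1}{2}(u \mp t)$, positivity of $P$ in (4.1) is visible directly from the closed form.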
If $|f(t)| \le 1$, $0 < t < 2\pi$, this yields


$$\sum_{n=1}^{\infty} s_n^*(x)^2 r^n \le \sum_{n=1}^{\infty} r^n \quad \text{for } 0 < r < 1. \tag{4.2}$$


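Similarly from (2.5)


$$\sum_{n=1}^{\infty} (s_n^* - s)^2 r^n = \frac{4r(1 - r^2)}{\pi^2}\int_0^{\pi}\!\!\int_0^{\pi} \phi(t)\phi(u) P(t, u; r)\,dt\,du = A(\phi; r),$$


and (4.1) gives $|A(\phi; r)| \le A(\Phi; r)$ whenever $|\phi(t)| \le \Phi(t)$. We first remark that

$$A(\phi; r) = o\Bigl(\frac{1}{1 - r}\Bigr) \quad \text{as } r \uparrow 1$$


is a necessary and sufficient condition for strong summability; i.e., for
$\sum_{\nu=1}^{n} (s_\nu^* - s)^2$ to be $o(n)$ as $n \to \infty$.
The necessity is obvious; the sufficiency follows from the inequality (valid
for any $p_\nu \ge 0$):

$$\sum_{\nu=1}^{n} p_\nu \le \Bigl(1 - \frac{1}{n}\Bigr)^{-n} \sum_{\nu=0}^{\infty} p_\nu \Bigl(1 - \frac{1}{n}\Bigr)^{\nu}.$$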


If now $\phi(t) \to 0$ as $t \downarrow 0$, then, using the same majorant as in §2, we obtain
still another proof of strong summability at points where $\phi(t) \to 0$. It is similar
to Fejér's second proof except that we use a simpler kernel.

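To prove Theorem I we observe that for the function $\phi_1(t)$ of (3.1) evidently


$$A(\phi_1; r) = \frac{4r(1 - r^2)}{\pi^2}\int_{\delta}^{\pi}\!\!\int_{\delta}^{\pi} \phi(t)\phi(u) P(t, u; r)\,dt\,du = o\Bigl(\frac{1}{1 - r}\Bigr). \tag{4.3}$$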


This together with the symmetry of the integrand gives (as in §3): 


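A necessary and sufficient condition for strong summability of (1.2) at $t = x$
is that for a fixed $\delta > 0$


$$A_\delta(\phi; r) = \frac{4r(1 - r^2)}{\pi^2}\int_0^{\delta}\!\!\int_0^{\delta} \phi(t)\phi(u) P(t, u; r)\,dt\,du = o\Bigl(\frac{1}{1 - r}\Bigr).$$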


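Again using Schwarz's inequality, we obtain


$$|A_\delta(\phi; r)| \le \frac{4r(1 - r^2)}{\pi^2}\int_0^{\delta} \phi(t)^2 \Bigl\{\int_0^{\pi} P(t, u; r)\,du\Bigr\}\,dt. \tag{4.4}$$


But

$$4r(1 - r^2)\int_0^{\pi} P(t, u; r)\,du = \cot \tfrac{1}{2}t \int_0^{\pi} \cot \tfrac{1}{2}u \Bigl(\sum_{n=1}^{\infty} \sin nt \, \sin nu \, r^n\Bigr) du = \pi \cot \tfrac{1}{2}t \sum_{n=1}^{\infty} r^n \sin nt.$$


The right-hand side of (4.4) is $o((1 - r)^{-1})$ since the cosine series of $\phi(t)^2$ is
Poisson summable at $t = 0$ under assumption (1.4). We have thus proved
Theorem I once again.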

5. In this section we assume


$$\int_0^t |\phi(x, u)|\,du = \Phi(t) = o(t) \quad \text{as } t \downarrow 0. \tag{5.1}$$


From (4.3) 


f f ‘| | P(t, u; r)dtdu 


+o(—). 


= 
(5.2) 


The first term on the right is 


4r(1 — 72) 
= f f | | 
0 0 


cos? 4¢ cos? 4u didu 


[(1 — r)? + sin? 4(u — #)][(1 — r)? + 4r sin? + 
i-r pipe 
< f f | $(2)o(u) 


(5.3) 


{a —r)?+ (u — [a + (u+ 


assuming $0 < \delta < \pi/2$. Let $1 - r < \delta$, and decompose the range of integration
into $0 \le u + t \le 1 - r$ and $1 - r \le u + t \le 2\delta$. For the first part, using (5.1), we
get


Hence, for $r \uparrow 1$,


1 
-[(1 — + r(u — 
Now, the last integral is 


— 1)? + r(u — = 2B,(r). 




Furthermore 


28 u 


25 
<(i-—,7)? | }(u)du 


28 28 26 
f u-| | du = + u~*d(u)du 
(1—r)/2 (i—r)/2 (1—r)/2 
25 
= O(1) + u~*(u)du. 


(1—r)/2 


Thus from (5.2), (5.5) and (5.6), as $r \uparrow 1$,


1 28 
o( f 
(—r)/2 
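Finally


$$\int_{(1-r)/2}^{2\delta} u^{-2}\Phi(u)\,du = \int_{(1-r)/2}^{\epsilon(r)} + \int_{\epsilon(r)}^{2\delta} = C_1(r) + C_2(r),$$

where we may assume

$$\tfrac{1}{2}(1 - r) < \epsilon(r) = \exp\Bigl[-\Bigl(\log \frac{1}{1 - r}\Bigr)^{1/2}\Bigr] < 2\delta.$$

Now

$$C_1(r) \le \max_{u \le \epsilon(r)} u^{-1}\Phi(u) \int_{(1-r)/2}^{\epsilon(r)} u^{-1}\,du = \max_{u \le \epsilon(r)} u^{-1}\Phi(u)\,\bigl[\log \epsilon(r) - \log \tfrac{1}{2}(1 - r)\bigr] \le \max_{u \le \epsilon(r)} u^{-1}\Phi(u) \cdot \log \frac{2}{1 - r} = o\Bigl(\log \frac{1}{1 - r}\Bigr),$$

and

$$C_2(r) = O\Bigl(\int_{\epsilon(r)}^{2\delta} u^{-1}\,du\Bigr) = O\Bigl(\log 2\delta + \Bigl(\log \frac{1}{1 - r}\Bigr)^{1/2}\Bigr)$$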


as $r \uparrow 1$. Summarizing,


< 0(—) + o(——1 —) + o(—1 


$$\sum_{n=1}^{\infty} (s_n^* - s)^2 r^n = o\Bigl(\frac{1}{1 - r} \log \frac{1}{1 - r}\Bigr).$$


Putting $r = 1 - 1/n$ yields

$$\sum_{\nu=1}^{n} (s_\nu^* - s)^2 = o(n \log n),$$

which proves Theorem II.
Addendum (May 27, 1940): To complete the proof of the criterion in §3
we remark that


$$\int_0^{\delta} \phi(t) \Bigl\{\int_{\delta}^{\pi} \phi(u) \cot \tfrac{1}{2}u \, R_n(t, u)\,du\Bigr\} \cot \tfrac{1}{2}t \, dt = o(n^3).$$


This follows easily from the fact that strong summability at a point is a local
property of the function. A similar remark holds for the criterion of §4. To
prove Theorems I and II we could also confine ourselves to the case $\delta = \pi$.

I have learned from Mathematical Reviews, vol. 1 (1940), p. 139, that 
T. Kawata (Proceedings of the Imperial Academy, Tokyo, vol. 15 (1939), pp. 
243-246) also gave a simpler proof of Theorem II. 


LITERATURE 


1. L. Fejér, Zur Summabilitätstheorie der Fourierschen und Laplaceschen Reihe, Proceedings
of the Cambridge Philosophical Society, vol. 34 (1938), pp. 503-509.

2. G. H. Hardy and J. E. Littlewood, The strong summability of Fourier series, Fundamenta
Mathematicae, vol. 25 (1935), pp. 162-189.

3. Otto Szász, Selected Topics in Function Theory of a Complex Variable, Brown University
Lectures, 1934-1935.

4. Otto Szász, On the Cesàro and Riesz means of Fourier series, Compositio Mathematica,
vol. 7 (1939), pp. 112-122.

5. A. Zygmund, Trigonometrical Series, 1935.


UNIVERSITY OF CINCINNATI,
CINCINNATI, OHIO.




THEORY OF REDUCTION FOR ARITHMETICAL 
EQUIVALENCE 


BY 
HERMANN WEYL 


INTRODUCTION 


Minkowski’s Geometrie der Zahlen as it was published in 1896 led up 
to two fundamental inequalities concerning a symmetric convex body in rela- 
tionship to a lattice; in his notation 


$$M^n V \le 2^n \tag{1}$$

and

$$S_1 \cdots S_n V \le 2^n. \tag{2}$$


The second inequality, which generalizes the first, is a decisive step towards 
a theory of reduction of arbitrary gauge functions under arithmetical equivalence.
In fact the problem of reduction for quadratic forms of $n$ variables
(ellipsoids) was the starting point of Minkowski’s investigations. But he must 
have found that the new instrument which he invented and of which he made 
so many beautiful applications in other directions was not quite adequate to 
the goal for which it had originally been devised. For 14 years later he came 
out with a paper on “Diskontinuitätsbereich für arithmetische Aequivalenz”
[1] which makes no use whatsoever of his own geometric methods. This was 
probably due to two difficulties: he failed to see a way of passing from pseudo- 
reduction to true reduction for an arbitrary convex body, and in the special 
case of ellipsoids he found the inequality of true reduction tied up with the 
selection of a finite number among the linear inequalities which characterize 
a reduced form. The latter knot was unraveled by a kind of topological argu- 
ment in a joint paper by L. Bieberbach and I. Schur [2] while K. Mahler 
in 1938 made an almost trivial remark which removed the first difficulty [3]. 
In a general overhauling of the geometry of numbers [4], to which the author 
was led by preparing an introductory talk for a seminar on the subject, he 
generalized (2) in such a way as to make the approach to that inequality 
more natural [5], rediscovered Mahler’s observation, substituted a simpler 
argument for that used by Bieberbach and Schur and finally extended 
Minkowski’s second theorem of finiteness. Without this extension certain 
primitive questions about the topological pattern of equivalent cells would 
be unanswerable. In a previous paper R. Remak had considerably shortened 
and sharpened Minkowski's estimate for the coefficients $b_{ik}$ which appear in


Presented to the Society, February 24, 1940; received by the editors February 16, 1940. 


the Jacobi transformation of a reduced quadratic form [6]. The author found 
that a considerable part of the theory of reduction could be carried through 
along the lines of Mahler’s approach for arbitrary convex bodies and that this 
more general procedure results in stronger rather than weaker estimates for 
the quantities on which the question of finiteness depends. 

The present paper sets forth the whole theory ab ovo, and hence is partly 
of a didactic nature; as far as possible it follows the geometric approach deal- 
ing with arbitrary convex bodies. In order to prevent it from becoming too 
dull reading, I have extended the theory to vectors and lattices and forms in 
which complex numbers or quaternions take the place of real numbers. Chapter I
deals with the general theory, Chapter II with the special case of quadratic,
Hermitian and “Hamiltonian” forms(1).


CHAPTER I. GENERAL THEORY OF REDUCTION 
A. THE REAL CASE 


1. Known facts about lattices. In the $n$-dimensional vector space $E_n$
whose elements are the $n$-uples $\mathfrak{x} = (x_1, \dots, x_n)$ of real numbers we consider
the lattice $\mathfrak{L}$ of the vectors with integral components $x_i$. The $n$ unit vectors
$\mathfrak{e}_k = (\delta_k^1, \dots, \delta_k^n)$ form a basis of, or span, this lattice in the sense that the
lattice vectors appear as sums $\sum_i x_i \mathfrak{e}_i$ with integral coefficients. Here $\delta_k^i$ are
the Kronecker $\delta$'s. Any basis $\mathfrak{s}_k = (s_k^1, \dots, s_k^n)$ of the lattice arises from the
absolute basis $\mathfrak{e}_k$ by a unimodular transformation $S = \|s_k^i\|$:


$$\mathfrak{s}_k = \sum_i s_k^i \mathfrak{e}_i.$$


The corresponding coordinates $x_i$ and $x_k'$, $\mathfrak{x} = \sum_i x_i \mathfrak{e}_i = \sum_k x_k' \mathfrak{s}_k$, are linked by
the equations(2)


$$x_i = \sum_k x_k' s_k^i, \quad \text{or briefly, } x = x'S.$$


The coefficients $s_k^i$ are integers and their determinant is $\pm 1$. The substitutions
$S$ with these properties form a group $\{S\}$, the modular group. Our viewpoint
is that the vector space is endowed with the lattice, but that the choice
of the lattice basis is arbitrary.


(1) A brief and masterly treatment of the reduction of quadratic forms along purely arithmetical
lines is to be found in a recent paper by C. L. Siegel, Abhandlungen aus dem mathematischen
Seminar der Hansischen Universität, vol. 13 (1939), pp. 209-239, of which I received
a reprint on March 20, 1940. (The number of the journal itself has not yet reached
Princeton.) But even against Siegel’s highly simplified arithmetical treatment, the geometrical 
approach retains the advantage of yielding sharper estimates. Siegel has a generalization of the 
second theorem of finiteness, different from ours, which leads to important applications in the 
domain of rational indefinite forms. (Added March 25, 1940.) 


(?) In preparation for a later generalization to quaternions we take good care to put factors 
in their proper order. 




Any $k$ linearly independent vectors $\mathfrak{d}_1, \dots, \mathfrak{d}_k$ $(0 \le k \le n)$ span a $k$-dimensional
subspace


$$E_k = E = [\mathfrak{d}_1, \dots, \mathfrak{d}_k].$$


If they are lattice vectors, then $E$ is a lattice subspace. $E_0$ consists of the vector
zero only.

A vector $\mathfrak{a}$ not in $E$ may be adjoined to $E$ and then gives rise to the
$(k+1)$-dimensional manifold $E' = [E, \mathfrak{a}]$ consisting of all sums


$$\mathfrak{x}' = \mathfrak{x} + x\mathfrak{a} \tag{3}$$


with $\mathfrak{x}$ in $E$, $x$ a number. If $E$ is a lattice subspace and $\mathfrak{a}$ a lattice vector, the
adjunction is said to be primitive provided every lattice vector (3) in $E'$ has
an integral coefficient $x$ (and hence a lattice component $\mathfrak{x}$ in $E$).

Suppose $\mathfrak{d}_1, \dots, \mathfrak{d}_k$ are $k$ linearly independent lattice vectors spanning
the lattice subspace $E = [\mathfrak{d}_1, \dots, \mathfrak{d}_k]$.


LEMMA 1. There exists a positive integer $M$ such that every lattice vector in $E$
is of the form


$$\frac{y_1}{M}\mathfrak{d}_1 + \cdots + \frac{y_k}{M}\mathfrak{d}_k$$


where the $y$'s are integers.


There are two essentially different proofs of this fact, one resting on divisibility
and determinants, the other on considerations of magnitude. The first
proof runs as follows. We can select $n - k$ among the unit vectors $\mathfrak{e}_1, \dots, \mathfrak{e}_n$,
say $\mathfrak{e}_1', \dots, \mathfrak{e}_{n-k}'$, such that


$$\mathfrak{d}_1, \dots, \mathfrak{d}_k, \mathfrak{e}_1', \dots, \mathfrak{e}_{n-k}' \tag{4}$$

are linearly independent. The determinant of the components of (4) is nonzero;
denote its absolute value by $M$. Writing down the equation


$$\mathfrak{x} = y_1\mathfrak{d}_1 + \cdots + y_k\mathfrak{d}_k + x_1'\mathfrak{e}_1' + \cdots + x_{n-k}'\mathfrak{e}_{n-k}' \tag{5}$$


for any lattice vector $\mathfrak{x}$ in terms of absolute components, one finds the coefficients
$y$ and $x'$ to be fractions with the common denominator $M$. This applies
in particular to the lattice vectors in $E$ for which $x_1' = \cdots = x_{n-k}' = 0$.
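The determinant argument can be traced on a small example. The sketch below is illustrative only: the vectors `d1`, `d2` in three dimensions and the completing unit vector `e3` are assumed sample data, and the helper names are hypothetical. It solves equation (5) exactly by Cramer's rule and confirms that the coefficients of lattice vectors in $E$ become integers after multiplication by $M$.

```python
from fractions import Fraction
from itertools import product


def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


d1, d2 = (1, 1, 0), (1, -1, 0)   # assumed lattice vectors spanning E
e3 = (0, 0, 1)                    # unit vector completing the system (4)
M = abs(det3(d1, d2, e3))         # absolute value of the determinant of (4)


def coefficients(x):
    """Solve x = y1*d1 + y2*d2 + x3*e3 exactly by Cramer's rule."""
    D = det3(d1, d2, e3)
    return (Fraction(det3(x, d2, e3), D),
            Fraction(det3(d1, x, e3), D),
            Fraction(det3(d1, d2, x), D))


# Every lattice vector (a, b, 0) of E has coefficients with denominator dividing M.
for a, b in product(range(-3, 4), repeat=2):
    y1, y2, x3 = coefficients((a, b, 0))
    assert x3 == 0
    assert (M * y1).denominator == 1 and (M * y2).denominator == 1

print(M, coefficients((1, 0, 0)))
```

In this example the vector $(1, 0, 0)$ lies in $E$ but is not an integral combination of `d1` and `d2`, which is why a denominator $M > 1$ is needed.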

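The other proof compares $\mathfrak{L} \cap E = \mathfrak{L}_E$, “the lattice in $E$,” with the coarser
lattice $\mathfrak{L}_E^0$ consisting of all integral combinations of $\mathfrak{d}_1, \dots, \mathfrak{d}_k$,


$$y_1\mathfrak{d}_1 + \cdots + y_k\mathfrak{d}_k \qquad (y_1, \dots, y_k \text{ integers}). \tag{6}$$


We maintain that there is only a finite number $M$ of vectors in $\mathfrak{L}_E$ which are
incongruent modulo $\mathfrak{L}_E^0$. For every vector $\mathfrak{x}$ in $E$ there exists a reduced one


$$\mathfrak{x}^* \equiv \mathfrak{x} \pmod{\mathfrak{L}_E^0}, \qquad \mathfrak{x}^* = y_1^*\mathfrak{d}_1 + \cdots + y_k^*\mathfrak{d}_k, \tag{7}$$


which satisfies the inequalities

$$0 \le y_i^* < 1 \qquad (i = 1, \dots, k). \tag{8}$$

Using again the absolute components one readily derives from (8) upper
bounds for the $|x_i^*|$ of any reduced vector $\mathfrak{x}^* = (x_1^*, \dots, x_n^*)$. Hence if the $x_i^*$
are required to be integers, which is the case when $\mathfrak{x}$ and thus $\mathfrak{x}^*$ is a lattice
vector, one finds oneself restricted to a finite number of possibilities. Our result
states that the additive Abelian group $\mathfrak{L}_E / \mathfrak{L}_E^0$ is of finite order $M$, and
therefore every vector $\mathfrak{x}$ of $\mathfrak{L}_E$ satisfies the congruence $M\mathfrak{x} \equiv 0 \pmod{\mathfrak{L}_E^0}$, which was
to be proved.

The vectors $\mathfrak{d}_1, \dots, \mathfrak{d}_k$ form a lattice basis of $E$ if $\mathfrak{L}_E^0$ coincides with $\mathfrak{L}_E$,
that is to say, if every lattice vector in $E$ is of the form (6).

The vector $\mathfrak{s}_k$ of any basis $(\mathfrak{s}_1, \dots, \mathfrak{s}_n)$ of $\mathfrak{L}$ evidently is a primitive adjunction
to $[\mathfrak{s}_1, \dots, \mathfrak{s}_{k-1}]$. More generally, we have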


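LEMMA 2. Suppose $\mathfrak{s}_1, \dots, \mathfrak{s}_n$ constitute a basis of $\mathfrak{L}$. The vector

$$\mathfrak{a} = a_1\mathfrak{s}_1 + \cdots + a_n\mathfrak{s}_n$$


is a primitive adjunction to $E = [\mathfrak{s}_1, \dots, \mathfrak{s}_{k-1}]$ if and only if $a_1, \dots, a_n$ are
integers and $a_k, \dots, a_n$ are without common divisor.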


Proof. 1. If $(a_k, \dots, a_n)$ have a common divisor $d > 1$, then


$$\frac{1}{d}(a_k\mathfrak{s}_k + \cdots + a_n\mathfrak{s}_n) \tag{9}$$


evidently is a vector $\mathfrak{x}'$ in $E' = [E, \mathfrak{a}]$ for which the $x$ in (3) is $1/d$ and thus
not an integer.

2. If one denotes by $x_i'$ the components of $\mathfrak{x}'$ in (3) with respect to the
basis $\mathfrak{s}_i$, one has

$$x_k' = xa_k, \ \dots, \ x_n' = xa_n. \tag{10}$$


Hence (10) must be integers for any lattice vector $\mathfrak{x}'$ in $E'$. However if
$a_k, \dots, a_n$ are without common divisor one can ascertain integers $l_k, \dots, l_n$
satisfying the equation


$$a_kl_k + \cdots + a_nl_n = 1.$$

The integrity of (10) then results in the integrity of


$$x = x_k'l_k + \cdots + x_n'l_n$$

itself.
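Both halves of the proof are constructive. The following sketch, with hypothetical helper names and assumed sample coordinates, tests the criterion of Lemma 2 and produces integers $l_k, \dots, l_n$ with $a_kl_k + \cdots + a_nl_n = 1$ by repeated use of the extended Euclidean algorithm.

```python
from functools import reduce
from math import gcd


def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y = g = gcd(a, b) and g >= 0."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y


def bezout(values):
    """Return (g, coeffs) with sum(c * v) = g = gcd of all values."""
    g = abs(values[0])
    coeffs = [1 if values[0] >= 0 else -1]
    for v in values[1:]:
        g, x, y = ext_gcd(g, v)
        coeffs = [c * x for c in coeffs] + [y]
    return g, coeffs


def is_primitive_adjunction(a, k):
    """Criterion of Lemma 2 for integer coordinates a = (a_1, ..., a_n).

    The adjunction to [s_1, ..., s_{k-1}] is primitive exactly when
    a_k, ..., a_n have no common divisor (k is 1-indexed).
    """
    return reduce(gcd, a[k - 1:], 0) == 1


a = (4, 6, 10, 15)              # assumed sample coordinates
g, l = bezout(list(a[1:]))      # coefficients l_2, l_3, l_4
print(g, l, sum(c * v for c, v in zip(l, a[1:])))
print(is_primitive_adjunction(a, 2), is_primitive_adjunction((1, 6, 10, 14), 2))
```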


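LEMMA 3. Suppose $E'$ is a given lattice subspace and $\mathfrak{d}$ a lattice vector outside
$E'$. Then one can pass from $E'$ to $E = [E', \mathfrak{d}]$ by a primitive adjunction $\mathfrak{s}$.


Proof. Let $E'$ be spanned by the $k - 1$ linearly independent lattice vectors
$\mathfrak{s}_1, \dots, \mathfrak{s}_{k-1}$ and use the notations $\mathfrak{L}_E$, $\mathfrak{L}_E^0$ with respect to the basis
$(\mathfrak{s}_1, \dots, \mathfrak{s}_{k-1}, \mathfrak{d})$ of $E$. We write each vector $\mathfrak{x}$ of $\mathfrak{L}_E$ in the form (3),

$$\mathfrak{x} = x\mathfrak{d} + \mathfrak{x}' \qquad (\mathfrak{x}' \text{ in } E'). \tag{11}$$

If $M$ is the order of the additive Abelian group $\mathfrak{L}_E / \mathfrak{L}_E^0$, we know that

$$Mx = y \tag{12}$$

is an integer. Select a full system of residues

$$\mathfrak{x}^{(0)} = 0,\ \mathfrak{x}^{(1)},\ \dots,\ \mathfrak{x}^{(M-1)}$$


of $\mathfrak{L}_E$ modulo $\mathfrak{L}_E^0$ and denote by $y^{(0)} = 0, y^{(1)}, \dots, y^{(M-1)}$ the corresponding
numbers $y$ as defined by (11), (12). The integers $M, y^{(1)}, \dots, y^{(M-1)}$ have a
greatest common divisor (G.C.D.) $m^*$, namely a common divisor expressible
as a linear combination


$$m^* = lM + l_1y^{(1)} + \cdots + l_{M-1}y^{(M-1)}$$

with integral coefficients $l$. By forming the corresponding combination

$$l\mathfrak{d} + l_1\mathfrak{x}^{(1)} + \cdots + l_{M-1}\mathfrak{x}^{(M-1)}$$

we obtain a vector $\mathfrak{s}$ of $\mathfrak{L}_E$,

$$\mathfrak{s} = (m^*/M)\mathfrak{d} + \mathfrak{s}' \qquad (\mathfrak{s}' \text{ in } E'),$$


such that for every $\mathfrak{x}$ in $\mathfrak{L}_E$ the coefficient $y$ is divisible by $m^*$. This $\mathfrak{s}$ evidently
satisfies our lemma.
Since $m^*$ is a divisor of $M$, $M = mm^*$, we have


$$\mathfrak{s} = (1/m)\mathfrak{d} + b_1\mathfrak{s}_1 + \cdots + b_{k-1}\mathfrak{s}_{k-1}; \tag{13}$$


$m$ is a positive integer. Moreover one can assume

$$|b_1| \le \tfrac{1}{2}, \ \dots, \ |b_{k-1}| \le \tfrac{1}{2}. \tag{14}$$


In the special case $m = 1$ one may simply take $\mathfrak{s} = \mathfrak{d}$.

We shall use our lemma only for the case when $\mathfrak{s}_1, \dots, \mathfrak{s}_{k-1}$ constitute a
lattice basis of $E'$. Then the lemma makes possible, by induction with respect
to $k$, the construction of a lattice basis for any given lattice subspace.

All these simple facts about lattices are well known to the mathematician
and the crystallographer. We had to restate them for later use and generalizations.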

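2. Gauge functions. Minkowski's inequality. According to Minkowski, a
real-valued continuous function $f(\mathfrak{x}) = f(x_1, \dots, x_n)$ in vector space is said
to be a gauge function under the following three conditions:

(i) $f(\mathfrak{x}) > 0$, except for $x_1 = \cdots = x_n = 0$;
(ii) $f(tx_1, \dots, tx_n) = |t| \cdot f(x_1, \dots, x_n)$ for any real factor $t$;
(iii) $f(\mathfrak{x} + \mathfrak{x}') \le f(\mathfrak{x}) + f(\mathfrak{x}')$.

One may use this function to endow the $n$-dimensional affine point space with
a metric by ascribing the distance $f(\overrightarrow{pp'})$ to any two points $p$, $p'$. The gauge
body $\mathfrak{K}$ defined by $f(\mathfrak{x}) < 1$ is an open convex bounded set surrounding the
origin $\mathfrak{x} = 0$. (Boundedness follows from the fact that $f(x_1, \dots, x_n)$ has a positive
minimum on the sphere $x_1^2 + \cdots + x_n^2 = 1$.) $\mathfrak{K}$ has a Jordan volume $V$.
Equation (13), together with (14) and $m \ge 1$, results in the inequality


$$f(\mathfrak{s}) \le f(\mathfrak{d}) + \tfrac{1}{2}\{f(\mathfrak{s}_1) + \cdots + f(\mathfrak{s}_{k-1})\}. \tag{15}$$


If one makes the distinction $m = 1$ or $m \ge 2$ one finds that $f(\mathfrak{s})$ cannot exceed
both numbers


$$f(\mathfrak{d}), \qquad \tfrac{1}{2}f(\mathfrak{d}) + \tfrac{1}{2}\{f(\mathfrak{s}_1) + \cdots + f(\mathfrak{s}_{k-1})\}.$$

Therefore we may state this:


SUPPLEMENT TO LEMMA 3. The vector $\mathfrak{s}$ may be chosen so that (15) holds,
or even so that


$$f(\mathfrak{s}) \le \max \{f(\mathfrak{d}),\ \tfrac{1}{2}f(\mathfrak{d}) + \tfrac{1}{2}(f(\mathfrak{s}_1) + \cdots + f(\mathfrak{s}_{k-1}))\}. \tag{16}$$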


Minkowski determines a sequence of lattice vectors $\mathfrak{d}_1, \dots, \mathfrak{d}_n$ and lattice
subspaces $E_0, E_1, \dots, E_n$ starting with the zero-space $E_0$ by the following
induction with respect to $k$.

Among all lattice vectors $\mathfrak{a}$ outside $E_{k-1}$, one chooses one, $\mathfrak{d}_k$, for which
$f(\mathfrak{a})$ takes on the least possible value, so that $f(\mathfrak{a}) \ge f(\mathfrak{d}_k)$ for every $\mathfrak{a}$ outside
$E_{k-1}$. The space $E_k$ arises from $E_{k-1}$ by the adjunction of $\mathfrak{d}_k$: $E_k = [E_{k-1}, \mathfrak{d}_k]$.

We put $f(\mathfrak{d}_k) = M_k$. Evidently

$$M_1 \le M_2 \le \cdots \le M_n.$$

Consider the continuous series of homothetic solids


$$\mathfrak{K}(q): \quad f(\mathfrak{x}) < q$$


increasing with the positive parameter $q$. Our $M_k$ can be described thus:
$\mathfrak{K}(q)$ contains less than $k$ linearly independent lattice vectors as long as
$q \le M_k$, but at least $k$ such vectors if $q > M_k$. Hence $M_1, \dots, M_n$ are uniquely
determined. About these consecutive minima Minkowski proved the fundamental
inequality:


THEOREM 1.

$$M_1 \cdots M_n V \le 2^n. \tag{2}$$


For later purposes we repeat this proposition in the following slightly
modified form: Suppose $M_1', \dots, M_n'$ are given positive numbers such that
the number of linearly independent lattice vectors $\mathfrak{x}$ for which $f(\mathfrak{x}) < M_k'$ is
less than $k$. Then


$$M_1' \cdots M_n' V \le 2^n. \tag{17}$$
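Minkowski's construction and the inequality (2) can be followed on a two-dimensional example. The sketch below assumes a particular gauge function, the square root of the positive definite form $2x^2 + 2xy + 3y^2$, and finds the consecutive minima by brute force over a box of lattice vectors. The form, the box size and the helper names are assumptions of this example; the area formula for the ellipse $f < 1$ is the standard one.

```python
import math
from itertools import product

A, B, C = 2, 2, 3   # assumed example: f(x, y)^2 = 2x^2 + 2xy + 3y^2


def f(v):
    """Gauge function: square root of a positive definite quadratic form."""
    x, y = v
    return math.sqrt(A * x * x + B * x * y + C * y * y)


def successive_minima(bound=6):
    """Minkowski's vectors d_1, d_2, searched in the box |x|, |y| <= bound.

    The smallest eigenvalue of the form exceeds 1, so f(v)^2 > 36 outside
    the box, which is far above the minima found inside it.
    """
    vectors = [v for v in product(range(-bound, bound + 1), repeat=2)
               if v != (0, 0)]
    vectors.sort(key=f)
    d1 = vectors[0]
    # d_2: smallest lattice vector outside E_1 = [d_1] (nonzero determinant).
    d2 = next(v for v in vectors if d1[0] * v[1] - d1[1] * v[0] != 0)
    return d1, d2


d1, d2 = successive_minima()
M1, M2 = f(d1), f(d2)
V = 2 * math.pi / math.sqrt(4 * A * C - B * B)   # area of the ellipse f < 1
print(d1, d2, M1, M2, V, M1 * M2 * V)
```

For this form the product $M_1M_2V$ stays below $2^2 = 4$, as Theorem 1 requires.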




While $M_1, \dots, M_n$ are uniquely determined, there may be a certain
amount of free play in the choice of $\mathfrak{d}_1, \dots, \mathfrak{d}_n$. The most one can say about
it in general terms is this:


THEOREM 2. If $\mathfrak{d}_1', \dots, \mathfrak{d}_n'$ are a second set of lattice vectors determined
just like $\mathfrak{d}_1, \dots, \mathfrak{d}_n$, and if, for a certain $k$, $M_k < M_{k+1}$, then $\mathfrak{d}_1', \dots, \mathfrak{d}_k'$ are
linear combinations of $\mathfrak{d}_1, \dots, \mathfrak{d}_k$ only.


Proof. Suppose one of the vectors $\mathfrak{d}_1', \dots, \mathfrak{d}_k'$, say $\mathfrak{d}_i'$, is not a linear combination
of $\mathfrak{d}_1, \dots, \mathfrak{d}_k$. Then $\mathfrak{d}_1, \dots, \mathfrak{d}_k, \mathfrak{d}_i'$ are linearly independent, and
hence not all the $k + 1$ numbers


$$f(\mathfrak{d}_1) = M_1, \ \dots, \ f(\mathfrak{d}_k) = M_k, \ f(\mathfrak{d}_i') = M_i$$


can be less than $M_{k+1}$. This contradicts the assumption $M_k < M_{k+1}$.

The problem of reduction consists in constructing a basis for the lattice $\mathfrak{L}$
in terms of the given gauge function $f$. The vectors $\mathfrak{d}_1, \dots, \mathfrak{d}_n$ do not yet
solve the problem because in general they do not span the whole lattice $\mathfrak{L}$.
Our next task will be to pass from this pseudo-reduction to true reduction, a
step well prepared by the considerations of §1.

3. Reduction. The only modification needed in the definition of $\mathfrak{d}_k$ is the
insertion at its proper place of the word “primitive.” The new inductive definition
of lattice vectors $\mathfrak{s}_1, \dots, \mathfrak{s}_n$ and lattice subspaces $E_0, E_1, \dots, E_n$ runs
as follows:

Among all primitive adjunctions $\mathfrak{a}$ to $E_{k-1}$, we choose one, $\mathfrak{s}_k$, for which $f(\mathfrak{a})$
assumes the least possible value, so that


$$f(\mathfrak{a}) \ge f(\mathfrak{s}_k)$$

for every primitive adjunction $\mathfrak{a}$ to $E_{k-1}$. Moreover

$$E_k = [E_{k-1}, \mathfrak{s}_k].$$


Lemma 3 guarantees the existence of primitive adjunctions $\mathfrak{a}$ to $E_{k-1}$.
We realize by induction that $\mathfrak{s}_1, \dots, \mathfrak{s}_k$ is a lattice basis for $E_k$, hence
$\mathfrak{s}_1, \dots, \mathfrak{s}_n$ for the whole space. We put $f(\mathfrak{s}_k) = L_k$. Taking Lemma 2 into
account, we can give our definition of a reduced basis $\mathfrak{s}_1, \dots, \mathfrak{s}_n$ the following
turn:

An $n$-uple of integers $(x_1, \dots, x_n)$ is said to belong to $X_k$ if $x_k, \dots, x_n$
are without common divisor. The basis $\mathfrak{s}_1, \dots, \mathfrak{s}_n$ of $\mathfrak{L}$ is reduced with respect
to $f$, if for every $k = 1, \dots, n$ and every $(x_1, \dots, x_n)$ of $X_k$ the inequality


$$f(x_1\mathfrak{s}_1 + \cdots + x_n\mathfrak{s}_n) \ge f(\mathfrak{s}_k) \tag{18}$$


holds [7]. Our procedure has led up to this result:


THEOREM 3. For every gauge function $f$ there exists a reduced basis $\mathfrak{s}_1, \dots,
\mathfrak{s}_n$ of the lattice.
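In two dimensions the inductive definition can be carried out by exhaustive search. The sketch below assumes the gauge function $f(x, y) = (5x^2 + 16xy + 13y^2)^{1/2}$, for which the unit vectors are far from reduced, and a search box chosen large enough for this particular form. Both choices, and the helper names, are assumptions of the example. By Lemma 2, a primitive adjunction to $E_0$ is a vector with coprime coordinates, and a primitive adjunction to $[\mathfrak{s}_1]$ is a vector $\mathfrak{a}$ with $\det(\mathfrak{s}_1, \mathfrak{a}) = \pm 1$.

```python
import math
from itertools import product

A, B, C = 5, 16, 13   # assumed example: f(x, y)^2 = 5x^2 + 16xy + 13y^2


def gauge(v):
    """Square root of the positive definite form (its determinant is 1)."""
    x, y = v
    return math.sqrt(A * x * x + B * x * y + C * y * y)


def reduced_basis(bound=6):
    """Reduced basis (s_1, s_2) of the integer lattice by brute force.

    The smallest eigenvalue of the form is about 0.0557, so vectors outside
    the box have gauge(v)^2 > 2.7, above the minima found inside it.
    """
    vectors = [v for v in product(range(-bound, bound + 1), repeat=2)
               if v != (0, 0)]
    # k = 1: primitive adjunctions to E_0 are the vectors with coprime entries.
    s1 = min((v for v in vectors if math.gcd(v[0], v[1]) == 1), key=gauge)
    # k = 2: a is a primitive adjunction to [s_1] when det(s_1, a) = +1 or -1.
    s2 = min((v for v in vectors
              if abs(s1[0] * v[1] - s1[1] * v[0]) == 1), key=gauge)
    return s1, s2


s1, s2 = reduced_basis()
L1, L2 = gauge(s1), gauge(s2)
print(s1, s2, L1, L2)
```

Here the form equals $(x + 2y)^2 + (2x + 3y)^2$, so it is equivalent to the sum of two squares and both $L_1$ and $L_2$ come out as 1.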




Relation (18) implies

$$f(\mathfrak{s}_{k+1}) \ge f(\mathfrak{s}_k)$$


or

$$L_1 \le L_2 \le \cdots \le L_n. \tag{19}$$


The following proposition ties up pseudo-reduction with the reduction
just defined [8]:


THEOREM 4 (Mahler's theorem). One has

$$L_k \le \theta_k M_k \tag{20}$$

where $\theta_k$ is a constant independent of the gauge function $f$.


An immediate corollary derived from it by Minkowski's inequality (2) is


THEOREM 5. The relation

$$L_1 \cdots L_n V \le \mu_n \tag{21}$$

holds with $\mu_n = 2^n \cdot \theta_1\theta_2 \cdots \theta_n$.


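Proof. After we have ascertained $\mathfrak{s}_1, \dots, \mathfrak{s}_{k-1}$ we determine a primitive
adjunction $\mathfrak{s}$ to $E' = [\mathfrak{s}_1, \dots, \mathfrak{s}_{k-1}]$ by the construction of Lemma 3, choosing
$\mathfrak{d}$ in this particular fashion: One of the $k$ linearly independent vectors
$\mathfrak{d}_1, \dots, \mathfrak{d}_k$ occurring in Minkowski's construction, say $\mathfrak{d}_i$, lies outside $E'$.
We take $\mathfrak{d} = \mathfrak{d}_i$ and then find a primitive adjunction $\mathfrak{s}$ to $E'$ such that
$[E', \mathfrak{s}] = [E', \mathfrak{d}]$. By the supplement to Lemma 3 one will have


$$f(\mathfrak{s}) \le f(\mathfrak{d}) + \tfrac{1}{2}\{f(\mathfrak{s}_1) + \cdots + f(\mathfrak{s}_{k-1})\}.$$


Since $f(\mathfrak{d})$ is one of the numbers $M_1, \dots, M_k$ and hence is less than or equal
to $M_k$, and since by definition $L_k \le f(\mathfrak{s})$, we find


$$L_k \le M_k + \tfrac{1}{2}(L_1 + \cdots + L_{k-1}),$$

which under the assumption of the inequalities

$$L_1 \le \theta_1M_1, \ \dots, \ L_{k-1} \le \theta_{k-1}M_{k-1}$$

leads on to

$$L_k \le \theta_kM_k$$

with

$$\theta_k = 1 + \tfrac{1}{2}(\theta_1 + \cdots + \theta_{k-1}). \tag{22}$$


Hence Theorem 4 is proved inductively, and, by the recursive relations (22)
or


$$\theta_1 = 1, \qquad \theta_{k+1} = 1 + \tfrac{1}{2}(\theta_1 + \cdots + \theta_{k-1}) + \tfrac{1}{2}\theta_k = \theta_k + \tfrac{1}{2}\theta_k = \tfrac{3}{2}\theta_k$$


we find the following explicit expressions for $\theta_k$ and $\mu_n$:

$$\theta_k = (3/2)^{k-1}, \qquad \mu_n = 2^n (3/2)^{n(n-1)/2}.$$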
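The passage from the recursion (22) to closed expressions can be checked with exact rational arithmetic. The following sketch (the function names are hypothetical) builds $\theta_1, \dots, \theta_n$ from (22) and forms $\mu_n = 2^n\theta_1 \cdots \theta_n$ as in Theorem 5.

```python
from fractions import Fraction


def theta_by_recursion(n):
    """theta_1, ..., theta_n from (22): theta_k = 1 + (theta_1 + ... + theta_{k-1})/2."""
    thetas = []
    for _ in range(n):
        thetas.append(1 + Fraction(1, 2) * sum(thetas))
    return thetas


def mu(n):
    """mu_n = 2^n * theta_1 * ... * theta_n, the constant of Theorem 5."""
    value = Fraction(2) ** n
    for t in theta_by_recursion(n):
        value *= t
    return value


for k, t in enumerate(theta_by_recursion(6), start=1):
    print(k, t, Fraction(3, 2) ** (k - 1))
print(mu(3))
```

The recursion reproduces the geometric progression with ratio $3/2$, so the product in $\mu_n$ collects the exponent $0 + 1 + \cdots + (n - 1) = n(n-1)/2$.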


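Suppose $p_0, p_1, \dots, p_n$ are given numbers satisfying the following conditions:


$$1 = p_0 \le p_1 \le \cdots \le p_n. \tag{23}$$


A basis $\mathfrak{s}_1, \dots, \mathfrak{s}_n$ of $\mathfrak{L}$ is said to have the property $B(p_1, \dots, p_n)$ if the inequality


$$f(x_1\mathfrak{s}_1 + \cdots + x_n\mathfrak{s}_n) \ge \frac{1}{p_k} f(\mathfrak{s}_k)$$


holds whenever $(x_1, \dots, x_n)$ is an $n$-uple in $X_k$ and $k$ one of the indices
$1, \dots, n$. By exploiting our method to the full we arrive at the following [9]
generalization of Theorem 4:


THEOREM 6. If the lattice basis $\mathfrak{s}_k$ has the property $B(p_1, \dots, p_n)$, then the
values $f(\mathfrak{s}_k) = L_k$ satisfy the inequalities


$$L_k \ge \frac{1}{p_i} L_i \qquad (\text{for } k > i) \tag{24}$$


and

$$L_k \le \theta_k(p) \cdot M_k \tag{25}$$

with a constant $\theta_k(p)$ depending on $p_1, \dots, p_k$ but not on $f$.


Relation (24) is a consequence of the fact that $(\delta_k^1, \dots, \delta_k^n)$ is an $n$-uple
in $X_i$ if $k > i$. Otherwise the proof follows the same road as before. (22) gives
place to this recursive equation:


$$\theta_k(p)/p_k = 1 + \tfrac{1}{2}\{\theta_1(p) + \cdots + \theta_{k-1}(p)\},$$


which in the same manner readily leads to

$$\theta_k(p) = p_k \prod_{i=1}^{k-1} (1 + \tfrac{1}{2}p_i).$$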
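The product formula for $\theta_k(p)$ follows from the recursion because the quotient $\theta_k(p)/p_k$ is multiplied by $1 + \tfrac{1}{2}p_k$ at each step. The sketch below compares the two on sample values of $p_1, \dots, p_n$. The sample values and function names are assumptions of this example.

```python
from fractions import Fraction


def theta_p_by_recursion(p):
    """theta_k(p) from theta_k(p)/p_k = 1 + (theta_1(p) + ... + theta_{k-1}(p))/2."""
    thetas = []
    for pk in p:
        thetas.append(pk * (1 + Fraction(1, 2) * sum(thetas)))
    return thetas


def theta_p_closed(p):
    """theta_k(p) = p_k * prod_{i<k} (1 + p_i/2)."""
    out, running = [], Fraction(1)
    for pk in p:
        out.append(pk * running)
        running *= 1 + Fraction(1, 2) * pk
    return out


sample_p = [Fraction(1), Fraction(3, 2), Fraction(2), Fraction(2), Fraction(5)]
print(theta_p_by_recursion(sample_p))
print(theta_p_closed(sample_p))
```

With $p_1 = \cdots = p_n = 1$ both functions return the constants $\theta_k = (3/2)^{k-1}$ of Theorem 4.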
One sees that $\theta_k(p)/p_k$ increases with $k$, and therefore (23) implies

$$1 = \theta_0(p) \le \theta_1(p) \le \cdots \le \theta_n(p). \tag{26}$$

One can repeat our whole argument after replacing (15) by the sharper 
and slightly more complex inequality (16). One then obtains this 


SUPPLEMENT TO THEOREMS 4-6. One may choose 


27) = (9) = pe + 400, 


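or, with a slight improvement,

$$\theta_1 = 1, \qquad \theta_k = (3/2)^{k-2} \qquad (\text{for } k \ge 2),$$

$$\theta_1(p) = p_1, \qquad \theta_k(p) = p_k \, \frac{1 + p_1}{2} \prod_{i=2}^{k-1} (1 + \tfrac{1}{2}p_i) \qquad (\text{for } k \ge 2). \tag{28}$$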
Shifting the accent, we call a gauge function $f(x_1, \dots, x_n)$ reduced if it
satisfies the inequalities


$$f(x_1, \dots, x_n) \ge f(\delta_k^1, \dots, \delta_k^n)$$


for any vector $(x_1, \dots, x_n)$ in $X_k$ and $k = 1, \dots, n$. This means that the unit
vectors $\mathfrak{e}_k = (\delta_k^1, \dots, \delta_k^n)$ form a reduced lattice basis with respect to $f$. The
inequalities (20) then hold for $L_k = f(\mathfrak{e}_k)$. If $f(\mathfrak{x})$ is any gauge function and
$\mathfrak{s}_1, \dots, \mathfrak{s}_n$ a reduced lattice basis with respect to $f$, we may set

$$f(x_1\mathfrak{s}_1 + \cdots + x_n\mathfrak{s}_n) = f^*(x_1, \dots, x_n).$$

Then $f^*(x_1, \dots, x_n)$ is a reduced gauge function, and we see that any gauge
function $f$ can be carried over into a reduced one by a unimodular transformation
$S$ of its variables. We shall adopt this terminology in Chapter II while at
present we stick to talking in terms of reduced bases rather than gauge functions.

4. The question of uniqueness. Denote by $X_k^*$ the set $X_k$ after excluding
the two $n$-uples

$$(x_1, \dots, x_n) = \pm(\delta_k^1, \dots, \delta_k^n).$$


The lattice basis $\mathfrak{s}_1, \dots, \mathfrak{s}_n$ is said to be properly reduced when for every
$k = 1, \dots, n$ and for every $(x_1, \dots, x_n)$ in $X_k^*$ the inequality (18) holds with
the $>$ sign.

The $2^n$ diagonal transformations of the modular group,


$$J: \quad \mathfrak{s}_1' = \pm\mathfrak{s}_1, \ \dots, \ \mathfrak{s}_n' = \pm\mathfrak{s}_n,$$


(all possible combinations of signs admitted) form a finite Abelian subgroup
$\{J\}$ of order $2^n$. Its generators are the involutions $J_1, \dots, J_n$ which change
one sign at a time:


$$J_k: \quad \mathfrak{s}_k' = -\mathfrak{s}_k \ \text{ and } \ \mathfrak{s}_i' = \mathfrak{s}_i \ \text{ for all } i \ne k.$$


Clearly the $J$ carry a reduced basis $(\mathfrak{s}_1, \dots, \mathfrak{s}_n)$ into a reduced one. The first
result concerning the question of uniqueness is that this exhausts the possibilities,
provided $(\mathfrak{s}_1, \dots, \mathfrak{s}_n)$ is properly reduced [10]. Of two lattice bases
$(\mathfrak{s}_1, \dots, \mathfrak{s}_n)$ and $(\mathfrak{s}_1', \dots, \mathfrak{s}_n')$, the first is called lower than the second
provided the first nonvanishing difference

$$f(\mathfrak{s}_1') - f(\mathfrak{s}_1), \ \dots, \ f(\mathfrak{s}_n') - f(\mathfrak{s}_n)$$

happens to be positive (which includes the case for which they are all zero).


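THEOREM 7. Let $(\mathfrak{s}_1', \dots, \mathfrak{s}_n')$ be any lattice basis and $(\mathfrak{s}_1, \dots, \mathfrak{s}_n)$ be a
properly reduced lattice basis. In these circumstances $(\mathfrak{s}_1, \dots, \mathfrak{s}_n)$ is lower than
$(\mathfrak{s}_1', \dots, \mathfrak{s}_n')$, and the equations

$$f(\mathfrak{s}_1') = f(\mathfrak{s}_1), \ \dots, \ f(\mathfrak{s}_n') = f(\mathfrak{s}_n)$$

imply

$$\mathfrak{s}_1' = \pm\mathfrak{s}_1, \ \dots, \ \mathfrak{s}_n' = \pm\mathfrak{s}_n.$$

If $(\mathfrak{s}_1', \dots, \mathfrak{s}_n')$ is reduced and $(\mathfrak{s}_1, \dots, \mathfrak{s}_n)$ is properly reduced, then

$$\mathfrak{s}_1' = \pm\mathfrak{s}_1, \ \dots, \ \mathfrak{s}_n' = \pm\mathfrak{s}_n.$$


Proof. Under the hypothesis that $(\mathfrak{s}_1, \dots, \mathfrak{s}_n)$ is properly reduced, we
have to show that


$$\mathfrak{s}_1' = \pm\mathfrak{s}_1, \ \dots, \ \mathfrak{s}_{k-1}' = \pm\mathfrak{s}_{k-1} \tag{29}$$

imply $f(\mathfrak{s}_k') \ge f(\mathfrak{s}_k)$, and even $f(\mathfrak{s}_k') > f(\mathfrak{s}_k)$ unless $\mathfrak{s}_k' = \pm\mathfrak{s}_k$.
Because of (29), $\mathfrak{s}_k'$ is a primitive adjunction to $[\mathfrak{s}_1', \dots, \mathfrak{s}_{k-1}'] = [\mathfrak{s}_1, \dots, \mathfrak{s}_{k-1}]$
and hence

$$f(\mathfrak{s}_k') \ge f(\mathfrak{s}_k). \tag{30}$$


As $(\mathfrak{s}_1, \dots, \mathfrak{s}_n)$ is properly reduced, the equality sign in (30) will hold only
if $\mathfrak{s}_k' = \pm\mathfrak{s}_k$.

Suppose $\mathfrak{s}_1', \dots, \mathfrak{s}_n'$ is reduced and (29) holds. Since $\mathfrak{s}_k$ is a primitive adjunction
to $[\mathfrak{s}_1', \dots, \mathfrak{s}_{k-1}']$, we must have $f(\mathfrak{s}_k) \ge f(\mathfrak{s}_k')$ in addition to (30),
and hence $f(\mathfrak{s}_k') = f(\mathfrak{s}_k)$, an equation which we have just found impossible unless
$\mathfrak{s}_k' = \pm\mathfrak{s}_k$. This establishes the full content of our theorem.

Much less can be said if the reduced basis $(\mathfrak{s}_1, \dots, \mathfrak{s}_n)$ is not properly
reduced.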


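THEOREM 8. If

$$\mathfrak{s}_1, \dots, \mathfrak{s}_n \quad \text{and} \quad \mathfrak{s}_1', \dots, \mathfrak{s}_n'$$

are two reduced bases, then

$$L_k = f(\mathfrak{s}_k), \qquad L_k' = f(\mathfrak{s}_k')$$

satisfy the inequalities

$$\theta_kL_k \ge L_k', \qquad \theta_kL_k' \ge L_k. \tag{31}$$


(This proposition indicates how far the uniqueness of the $M_k$ survives
for the $L_k$.)


Proof. Because there are $k$ linearly independent lattice vectors
$\mathfrak{x} = \mathfrak{s}_1, \dots, \mathfrak{s}_k$ for which $f(\mathfrak{x}) \le L_k$, $L_k$ cannot be smaller than $M_k$. Hence


$$\begin{aligned} M_k &\le L_k, & L_k &\le \theta_kM_k, \\ M_k &\le L_k', & L_k' &\le \theta_kM_k. \end{aligned} \tag{32}$$

Elimination of $M_k$ leads to the two inequalities (31).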
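The case when $(\mathfrak{s}_1, \dots, \mathfrak{s}_n)$ is reduced while the basis $(\mathfrak{s}_1', \dots, \mathfrak{s}_n')$ has
the property $B(p_1, \dots, p_n)$ will also be needed later. The $k$ linearly independent
vectors $\mathfrak{s}_1', \dots, \mathfrak{s}_{k-1}', \mathfrak{s}_k'$ impart values to $f$ which are less than or equal to


$$p_1L_k', \ \dots, \ p_{k-1}L_k', \ L_k'$$


respectively. Hence


$$M_k \le p_{k-1}L_k', \qquad L_k' \le \theta_k(p) \cdot M_k.$$


Substituting these inequalities for the second line of (32) and again eliminating
$M_k$, we find:


THEOREM 8$_p$. For a reduced basis $(\mathfrak{s}_1, \dots, \mathfrak{s}_n)$ and a basis $(\mathfrak{s}_1', \dots, \mathfrak{s}_n')$
of the property

$$B(p_1, \dots, p_n)$$


the values

$$L_k = f(\mathfrak{s}_k), \qquad L_k' = f(\mathfrak{s}_k')$$

satisfy the inequalities

$$\theta_k(p)L_k \ge L_k', \qquad p_{k-1}\theta_kL_k' \ge L_k. \tag{33}$$


With the same effort one could have established similar relations for two
bases of the properties $B(p_1, \dots, p_n)$ and $B(p_1', \dots, p_n')$ respectively. The
present generality, however, is sufficient for our purposes.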


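THEOREM 9$_p$. If, for a certain $k = 1, \dots, n - 1$,

$$\theta_k(p)\,\theta_{k+1}L_k < L_{k+1}, \tag{34}$$


then $\mathfrak{s}_1', \dots, \mathfrak{s}_k'$ are linear combinations of the vectors $\mathfrak{s}_1, \dots, \mathfrak{s}_k$ only and thus
arise from them by a unimodular transformation of degree $k$.


Proof. Suppose that in one of the vectors $\mathfrak{s}_1', \dots, \mathfrak{s}_k'$, say


$$\mathfrak{s}_i' = s_1\mathfrak{s}_1 + \cdots + s_n\mathfrak{s}_n,$$


not all the components $s_j$ $(j = k + 1, \dots, n)$ vanish. Then $\mathfrak{s}_1, \dots, \mathfrak{s}_k, \mathfrak{s}_i'$ are
linearly independent and hence the maximum of the $k + 1$ numbers


$$L_1 = f(\mathfrak{s}_1), \ \dots, \ L_k = f(\mathfrak{s}_k), \ L_i' = f(\mathfrak{s}_i')$$


must be greater than or equal to $M_{k+1}$. If on the contrary

$$L_1, \dots, L_k; \quad L_1', \dots, L_k' \tag{35}$$


are all less than $M_{k+1}$, then the $\mathfrak{s}_1', \dots, \mathfrak{s}_k'$ are linear combinations of
$\mathfrak{s}_1, \dots, \mathfrak{s}_k$ only. Now


$$L_i' \le \theta_i(p) \cdot L_i$$

and owing to $L_i \le L_k$, $\theta_i(p) \le \theta_k(p)$ (for $i \le k$)
all our requirements concerning (35) can be met by the one condition

$$\theta_k(p) \cdot L_k < M_{k+1},$$

which in its turn is a consequence of

$$\theta_k(p) \cdot L_k < L_{k+1}/\theta_{k+1}$$


because $L_{k+1} \le \theta_{k+1}M_{k+1}$.
In the particular case where $(\mathfrak{s}_1', \dots, \mathfrak{s}_n')$ is likewise reduced ($p_1 = \cdots
= p_n = 1$), we have the following close parallel to Theorem 2:


THEOREM 9. Let $\mathfrak{s}_1, \dots, \mathfrak{s}_n$ and $\mathfrak{s}_1', \dots, \mathfrak{s}_n'$ be two reduced bases of $\mathfrak{L}$, and
$f(\mathfrak{s}_k) = L_k$. Suppose that moreover, for some $k \le n - 1$,


$$\theta_k\theta_{k+1}L_k < L_{k+1}.$$

Then the first $k$ vectors $\mathfrak{s}_1', \dots, \mathfrak{s}_k'$ are linear combinations of $\mathfrak{s}_1, \dots, \mathfrak{s}_k$ only.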


B. THE IMAGINARY AND QUATERNION CASES 


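5. Integers and Minkowski's inequality in the complex field. Complex
numbers $\xi = x_0 + i x_1$ have two real components $x_0, x_1$. We denote the conjugate
by $\bar\xi = x_0 - i x_1$. Trace and norm:

$\operatorname{tr}\xi = \xi + \bar\xi = 2x_0, \quad N\xi = \xi\bar\xi = x_0^2 + x_1^2$

are real and the coefficients of a quadratic equation satisfied by $\xi$:


(36) $\xi^2 - \xi \cdot \operatorname{tr}\xi + N\xi = 0.$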


Let $\omega$ be a non-real number. 1, $\omega$ span a lattice $\mathfrak{J}$ in the Gaussian plane consisting of all numbers


(37) $\xi = y_0 + y_1\omega$ ($y_0, y_1$ integers).


If $\mathfrak{J}$ is closed with respect to multiplication and the operation $\xi \to \bar\xi$, then $\mathfrak{J}$
is a self-conjugate ring, and we agree to call the elements of $\mathfrak{J}$ integers. Owing
to the choice of 1 as an element of the lattice basis 1, $\omega$, the only real integers
(with $y_1 = 0$) are the common rational integers. Trace and norm of an integer $\xi$
are rational integers. Hence the quadratic equation (36) for $\xi = \omega$ shows that $\omega$


is of the form $\tfrac12(c + i d^{1/2})$ where c and d are rational integers and either
$c \equiv 0 \ (2),\ d \equiv 0 \ (4)$, or $c \equiv 1 \ (2),\ d \equiv -1 \ (4)$.


The lattice $\mathfrak{J}$ is rectangular in the first, rhombic in the second case. The
density of the lattice $\mathfrak{J}$, that is to say, the area of its fundamental parallelogram spanned by 1, $\omega$, is $\tfrac12 d^{1/2}$.

The numbers of the form (37) with rational coefficients $y_0, y_1$ form the
embedding field $\mathfrak{J}_0$. Indeed if $\xi \ne 0$ is in $\mathfrak{J}_0$ so is


$\xi^{-1} = \bar\xi/N\xi.$


$\mathfrak{J}_0$ is the quadratic field over the rational field determined by $(-d)^{1/2}$. The
$x_0, x_1$ and $y_0, y_1$, formula (37), are always spoken of as the x- and y-components
of a complex number $\xi = x_0 + i x_1$.

We ask for the least radius r such that the circles of radius r around all
integers cover the whole $\xi$-plane. One readily finds in the rectangular case,


$r = \tfrac12 (1 + \tfrac14 d)^{1/2},$
and in the rhombic case


$r = (1 + d)/(4 d^{1/2}).$
If $\xi$ is any complex number, one can always ascertain an integer $\alpha$ such that
$N(\xi - \alpha) \le r^2.$
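The two values of r can be checked numerically. The following Python sketch is only an illustration: the function names are ours, not the paper's, and it assumes the lattice bases 1, $\tfrac{i}{2} d^{1/2}$ (rectangular) and 1, $\tfrac12(1 + i d^{1/2})$ (rhombic). It evaluates both closed forms, compares them with a coarse brute-force search over the fundamental parallelogram, and lists the d for which r < 1.

```python
import math

def covering_radius_rectangular(d):
    # Lattice spanned by 1 and (i/2)*sqrt(d): half the diagonal of the basic rectangle.
    return 0.5 * math.sqrt(1 + d / 4)

def covering_radius_rhombic(d):
    # Lattice spanned by 1 and (1 + i*sqrt(d))/2: circumradius of the triangle 0, 1, omega.
    return (1 + d) / (4 * math.sqrt(d))

def brute_force_radius(omega, steps=60):
    # Sample the fundamental parallelogram on a grid and record the largest
    # distance from a sample point to its nearest lattice point (a lower estimate of r).
    worst = 0.0
    for a in range(steps):
        for b in range(steps):
            z = a / steps + (b / steps) * omega
            nearest = min(abs(z - (m + n * omega))
                          for m in range(-2, 4) for n in range(-2, 4))
            worst = max(worst, nearest)
    return worst

# Values of d (within the sampled ranges) for which r < 1.
rectangular_ok = [d for d in range(4, 40, 4) if covering_radius_rectangular(d) < 1]
rhombic_ok = [d for d in range(3, 40, 4) if covering_radius_rhombic(d) < 1]
```

The grid search is deliberately crude; it serves only as a sanity check on the closed forms.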


Another constant which will crop up later is the least norm $e^2$ of an integer
$\alpha \ne 0$ which is not a unit (i.e. for which $1/\alpha$ is no integer); e is either $2^{1/2}$, $3^{1/2}$
or 2.

We operate in a vector space $E_n$ of 2n real dimensions whose vectors
$\mathfrak{x} = (\xi_1, \dots, \xi_n)$ have arbitrary complex coordinates $\xi_i$. The lattice $\mathfrak{L}$ consists
of all vectors whose coordinates $\xi_i$ are integers (elements of $\mathfrak{J}$). The notion
of a lattice basis needs no explanation. The modular group {S} consists of
all unimodular transformations S,


$\xi_i = \sum_k \sigma_i^k \xi_k'$


with integral coefficients $\sigma_i^k$ whose determinant is a unit $\varepsilon$.
A gauge function is a real-valued continuous function $f(\xi_1, \dots, \xi_n)$ with
the following three properties:
(i) $f(\xi_1, \dots, \xi_n) > 0$ except for $(\xi_1, \dots, \xi_n) = (0, \dots, 0)$;


We introduce real coordinates by $\xi_k = x_{k0} + i x_{k1}$ and use them in defining the volumes of solids in our space. In particular V denotes the volume of
the gauge body

$f(\xi_1, \dots, \xi_n) < 1.$

We carry out Minkowski's construction according to the same recipe as in
the real case and thus determine n lattice vectors $\mathfrak{d}_1, \dots, \mathfrak{d}_n$ and consecutive minima $M_k = f(\mathfrak{d}_k)$. Our first concern is the analogue of Minkowski's inequality:

THEOREM 1*.

(38) $M_1^2 \cdots M_n^2\, V \le (2 d^{1/2})^n.$

We resort to Minkowski's original inequality in the form (17). But under
the present circumstances we deal with 2n real coordinates $x_{k0}, x_{k1}$ and with a
lattice which is the direct product of n two-dimensional lattices of density
$\tfrac12 d^{1/2}$ rather than 1. Hence the right side in (17) is to be replaced by


$2^{2n} (\tfrac12 d^{1/2})^n = (2 d^{1/2})^n.$


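The only lattice vectors $\mathfrak{x}$ for which $f(\mathfrak{x}) < M_k$ are linear combinations of
$\mathfrak{d}_1, \dots, \mathfrak{d}_{k-1}$ with complex coefficients. Hence there are at most 2(k-1) vectors satisfying this inequality which are linearly independent in the real sense.
Consequently we may take, for the 2n real minima $M_j^*$ entering (17),


$M_{2k-1}^* = M_{2k}^* = M_k,$
and in this way the inequality

$M_1^* \cdots M_{2n}^*\, V \le (2 d^{1/2})^n$

results in (38).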

6. The same for quaternions. A quaternion $\xi$ has four real components
$(x_0, x_1, x_2, x_3)$. The conjugate is $\bar\xi = (x_0, -x_1, -x_2, -x_3)$. The quaternions
(x, 0, 0, 0) can be identified with the real numbers x. Both trace and norm:


$\operatorname{tr}\xi = \xi + \bar\xi = 2x_0, \quad N\xi = \xi\bar\xi = x_0^2 + x_1^2 + x_2^2 + x_3^2$
are such real numbers. Every quaternion $\xi \ne 0$ has its reciprocal


(39) $\xi^{-1} = \bar\xi/N\xi;$


but since multiplication is noncommutative we have to do with a division
algebra rather than a field. Each quaternion $\xi$ satisfies the quadratic equation
(36) with real coefficients.
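These identities are easy to verify mechanically. The sketch below is an illustration only: it represents a quaternion as a 4-tuple of exact rationals, and the helper names (`qmul`, `conj`, `trace`, `norm`, `inverse`, `quadratic_residual`) are ours. It checks the reciprocal (39), the quadratic equation (36), and the failure of commutativity.

```python
from fractions import Fraction

def qmul(a, b):
    # Hamilton product of two quaternions given as (x0, x1, x2, x3).
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def conj(a):
    return (a[0], -a[1], -a[2], -a[3])

def trace(a):
    # tr(xi) = xi + conj(xi) = 2*x0, a real number.
    return 2 * a[0]

def norm(a):
    # N(xi) = xi * conj(xi) = x0^2 + x1^2 + x2^2 + x3^2.
    return sum(x * x for x in a)

def inverse(a):
    # Formula (39): xi^{-1} = conj(xi) / N(xi).
    n = norm(a)
    return tuple(x / n for x in conj(a))

def quadratic_residual(a):
    # Left member of (36): xi^2 - xi*tr(xi) + N(xi); it should vanish identically.
    sq, t, n = qmul(a, a), trace(a), norm(a)
    return (sq[0] - t*a[0] + n, sq[1] - t*a[1], sq[2] - t*a[2], sq[3] - t*a[3])

xi = (Fraction(1), Fraction(2), Fraction(-3), Fraction(1, 2))
unit_i, unit_j = (0, 1, 0, 0), (0, 0, 1, 0)
```

Exact rational arithmetic is used so that the checks are equalities rather than floating-point approximations.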

Any lattice $\mathfrak{J}$ in the four-dimensional space with the real coordinates
$x_0, x_1, x_2, x_3$ which is spanned by four linearly independent quaternions including 1,

(40) $\omega_0 = 1,\ \omega_1,\ \omega_2,\ \omega_3,$

may serve to define the integral quaternions as those of the form
(41) $\xi = y_0\omega_0 + y_1\omega_1 + y_2\omega_2 + y_3\omega_3$


with ordinary integral coefficients y, provided $\mathfrak{J}$ is closed with respect to
multiplication and the operation $\xi \to \bar\xi$. Then trace and norm of a quaternion
integer are rational integers. As (39) shows, the quaternions (41) with rational y form the embedding field $\mathfrak{J}_0$. We denote by $\tfrac14 d$ the density of the lattice $\mathfrak{J}$, and maintain that d is a rational integer. Although this fact is of little
importance to us I shall briefly indicate its proof.

With (41) we form

(42) $N\xi = \sum_{i,k=0}^{3} a_{ik}\, y_i y_k \quad (= x_0^2 + x_1^2 + x_2^2 + x_3^2).$

The coefficients
(43) $a_{ii}$ and $2a_{ik}$ for $i \ne k$
are rational integers. According to the transformation theory of quadratic
forms the discriminant of (42) is $(\tfrac14 d)^2$ and hence, because of (43), $d^2$ is a
rational integer. On the other side let us study the field $\mathfrak{J}_0$ and any basis
$\omega_0 = 1, \omega_1, \omega_2, \omega_3$ of the field. Starting with (40) we may first subtract from $\omega_1$
and $\omega_2$ half their traces and thus provide for the conditions


$\bar\omega_1 = -\omega_1, \quad \bar\omega_2 = -\omega_2.$


Then $\omega_1\omega_2 + \omega_2\omega_1$ is the trace of $\omega_1\omega_2$ and hence a real rational number 2c. Replacing $\omega_2$ by $\omega_2 + c'\omega_1$ with a suitable rational $c'$ (namely $c' = c/N\omega_1$), one gets


$\omega_2\omega_1 = -\omega_1\omega_2.$


$\omega_1\omega_2$ is in the field. Choosing it as $\omega_3$ the form (42) becomes


$y_0^2 + a y_1^2 + b y_2^2 + ab\, y_3^2$


which shows that its discriminant is the square of a rational number. This
property persists for any basis of $\mathfrak{J}_0$. Hence $d^2$ is the square of a rational
number d, and, as $d^2$ is integral, so is d itself [11].

r and e have the same significance as before. 

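The vectors $\mathfrak{x} = (\xi_1, \dots, \xi_n)$ which we now consider have arbitrary quaternions $\xi_k$ for their components,


$\xi_k = (x_{k0}, x_{k1}, x_{k2}, x_{k3})$
$\quad = y_{k0}\omega_0 + y_{k1}\omega_1 + y_{k2}\omega_2 + y_{k3}\omega_3.$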


The definition of lattice vectors remains unchanged. The modular group consists of all pairs of mutually inverse transformations
with integral coefficients $\sigma_i^k$, $\tau_i^k$. (This modification of the definition is forced
upon us because a quaternion matrix $\|\sigma_i^k\|$ has no determinant.) One has to
observe carefully the position of the factors. Our convention is that the subspace spanned by k linearly independent vectors $\mathfrak{d}_1, \dots, \mathfrak{d}_k$ consists of the
vectors $\eta_1\mathfrak{d}_1 + \cdots + \eta_k\mathfrak{d}_k$ with the coefficients $\eta$ in front of the vectors.

The description of a gauge function by the three properties (i), (ii), (iii)
stays unaltered, with the factor $\tau$ in front of the variables $\xi_1, \dots, \xi_n$ in (ii).
Minkowski's inequality assumes the form


$M_1^4 \cdots M_n^4\, V \le 2^{4n} (\tfrac14 d)^n,$
which we put down as


THEOREM 1**.
(44) $M_1^4 \cdots M_n^4\, V \le (4d)^n.$

7. Reduction. What remains will be done simultaneously for the imaginary and the quaternion cases in such language as applies literally to the more
complex of the two. We have to check Lemmas 1-3 of §1 as to their validity
under the new circumstances.

Both proofs of Lemma 1 go through with the following precautions. (5) is
to be written down in terms of the 4n integral y-components of the vectors
and coefficients concerned, and the positive rational integer M is the absolute
value of the determinant of the linear equations with 4n unknowns thus obtained. The inequalities (6) for a reduced vector (7),
must be replaced by


IA 


In order to secure the validity of Lemmas 2 and 3 an essentially new assumption has to be made:


HYPOTHESIS P. Every left or right ideal in the ring $\mathfrak{J}$ is a principal ideal.


As far as left ideals are concerned it requires: Any integers
$(\alpha_1, \dots, \alpha_n) \ne (0, \dots, 0)$
have a left common divisor $\delta$,
$\alpha_i = \delta\beta_i$ ($\beta_1, \dots, \beta_n$ integers),
which can be written as a linear combination
(45) $\delta = \alpha_1\lambda_1 + \cdots + \alpha_n\lambda_n$


with integral coefficients $\lambda_i$. This divisor $\delta$, which up to a right unit factor is
uniquely determined, is called the left G.C.D. of $\alpha_1, \dots, \alpha_n$. (The integers
represented by (45) if the $\lambda_i$ range independently over all integers coincide
with the values of $\delta \cdot \mu$ for all possible integral values of $\mu$. It is sufficient to
make the requirement for two integers $\alpha_1, \alpha_2$.)

Lemma 2, in which the last words "without common divisor" must be
changed into "without left common divisor," is true under the hypothesis P
for left ideals ($P_l$). Change the Roman into Greek letters and define d, or
rather $\delta$, as the left G.C.D. of $\alpha_k, \dots, \alpha_n$. The alternative 1 occurs if $\delta$ is
not a unit, the alternative 2 if $\alpha_k, \dots, \alpha_n$ are without left common divisor
(which means, of course, that they have no left common divisors except
units).

One has merely to glance through the proof of Lemma 3 in order to realize 
that it depends on the hypothesis P for right ideals. We obtain the primitive 
adjunction in the form 


8 = (1/u)d + t + 


where $\mu$ is a nonzero integer and the $\tau$ satisfy the inequalities


Nr S r?,---, S 
If $\mu$ is a unit one may take
As $N\mu \ge 1$ for any integer $\mu \ne 0$, the inequality (15) is turned over into
S f(d) + r{ +--+ + 


while in (16) the smallest norm e?>1 of integers makes its appearance: 


f(8) max { f(b), (1/e)f(d) + + + 


Incidentally hypotheses $P_l$ and $P_r$ are fulfilled if r < 1. For then Euclid's
algorithm for the G.C.D. goes through. In the complex field this happens for
the rectangular lattices $\mathfrak{J}$ with d = 4 (Gaussian field) and d = 8, and for the
rhombic lattices $\mathfrak{J}$ with d = 3, 7, 11. The most important example for quaternions is the classical case first treated by A. Hurwitz [12]: he declares a
quaternion $(x_0, x_1, x_2, x_3)$ to be integral when $2x_0, 2x_1, 2x_2, 2x_3$ are rational
integers either congruent to (0, 0, 0, 0) or to (1, 1, 1, 1) modulo 2. One realizes
at once that here r < 1; the exact value is $r = 1/2^{1/2}$.
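To see why r < 1 suffices, here is a minimal Python sketch of Euclid's algorithm in the simplest instance, the Gaussian integers (d = 4, where $r^2 = \tfrac12$). It is an illustration in the commutative case only, so the left/right distinction of hypothesis P does not arise; the pair representation and the helper names are our assumptions. Division with the nearest integer as quotient leaves a remainder $\rho$ with $N\rho \le r^2 \cdot N\beta < N\beta$, so the norms decrease strictly and the loop terminates.

```python
def gmul(a, b):
    # Product of Gaussian integers stored as (real, imaginary) pairs of ints.
    return (a[0]*b[0] - a[1]*b[1], a[0]*b[1] + a[1]*b[0])

def gnorm(a):
    return a[0]*a[0] + a[1]*a[1]

def nearest_quotient(a, b):
    # a / b = a * conj(b) / N(b); round each coordinate to a nearest rational integer.
    n = gnorm(b)
    p = gmul(a, (b[0], -b[1]))
    return ((2*p[0] + n) // (2*n), (2*p[1] + n) // (2*n))

def gdivmod(a, b):
    # Returns (q, rem) with a = q*b + rem and N(rem) <= N(b)/2.
    q = nearest_quotient(a, b)
    qb = gmul(q, b)
    return q, (a[0] - qb[0], a[1] - qb[1])

def ggcd(a, b):
    # Euclid's algorithm; the result is determined only up to a unit factor.
    while b != (0, 0):
        _, rem = gdivmod(a, b)
        a, b = b, rem
    return a

common = (2, 1)                      # norm 5
alpha = gmul(common, (3, 2))         # cofactor of norm 13
beta = gmul(common, (1, 4))          # cofactor of norm 17
delta = ggcd(alpha, beta)
```

Since the cofactors have coprime norms 13 and 17, the G.C.D. found must be a unit multiple of 2 + i.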

The whole theory of reduction of §§3 and 4 will now go through, practically without alterations. We indicate the few changes to be made. $X_k$ is the
set of all n-uples $(\xi_1, \dots, \xi_n)$ for which $\xi_1, \dots, \xi_n$ are integral and $\xi_k, \dots, \xi_n$
without left common divisor. $X_k^*$ arises from $X_k$ by excluding the following
n-uples:


$\varepsilon\,\mathfrak{e}_k$ ($\varepsilon$ a unit).
{J} consists of the diagonal transformations J with coefficients
$\sigma_i^k = \varepsilon_i\,\delta_i^k$
where $\varepsilon_1, \dots, \varepsilon_n$ are units. This group is the direct product of n factors each
of which is isomorphic with the group of units. The most essential point
concerns the values of the constants $\theta_k$, $\theta_k(p)$ and $\mu_n$.

Instead of the recursive formula (22) we get $\theta_k = 1 + r(\theta_1 + \cdots + \theta_{k-1})$,
leading to


$\theta_k = (1 + r)^{k-1},$


Sin 
6x(p) = pe (1 + rps). 
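The recursion for $\theta_k$ can be unrolled mechanically. The short Python sketch below (function name ours, purely illustrative) builds $\theta_1, \dots, \theta_n$ from $\theta_k = 1 + r(\theta_1 + \cdots + \theta_{k-1})$ with exact rationals, so that the result can be compared with the powers $(1 + r)^{k-1}$; for the real case $r = \tfrac12$ these are the powers of $\tfrac32$.

```python
from fractions import Fraction

def theta_by_recursion(n, r):
    # theta_k = 1 + r * (theta_1 + ... + theta_{k-1}); the empty sum gives theta_1 = 1.
    thetas = []
    for _ in range(n):
        thetas.append(1 + r * sum(thetas))
    return thetas

r_real = Fraction(1, 2)              # value of r in the real case
thetas = theta_by_recursion(6, r_real)
closed_form = [(1 + r_real) ** k for k in range(6)]
```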
THEOREM 5**. The inequality 


*? 


Ban 


lA 


holds, where $\kappa = 1, 2, 4$ characterize the real, imaginary and quaternion cases respectively and


2 1/2 n—ly)n 
wn = {2d -(1+9r)” }". 
(In the real case d = 4, $r = \tfrac12$.)
The same trick as used before, compare formulas (27) and (28), allows us
to improve to some extent these values of $\theta_k$, $\theta_k(p)$ and $\mu_n$.
CHAPTER II. REDUCTION OF QUADRATIC, HERMITIAN 
AND HAMILTONIAN FORMS 
8. Jacobi transformation. A quadratic form


(46) $f(\mathfrak{x}) = \sum_{i,j} g_{ij}\, x_i x_j$

of n variables $(x_1, \dots, x_n) = \mathfrak{x}$ is characterized by its real symmetric coefficients $g_{ij} = g_{ji}$ and may thus be denoted by $f = \{g_{ij}\}$. All quadratic forms constitute a linear space R of $N = \tfrac12 n(n+1)$ dimensions. In the imaginary and
the quaternion cases the analogues are the Hermitian and "Hamiltonian" forms
respectively,


(47) $f(\xi_1, \dots, \xi_n) = \sum_{i,j} \xi_i\, \gamma_{ij}\, \bar\xi_j$


whose complex or quaternion coefficients satisfy the symmetry condition
(48) $\gamma_{ji} = \bar\gamma_{ij}.$

The conjugate of a product is the product of the conjugates in inverted order.
This rule at once shows that the value of f is real, $\bar f = f$. In the quaternion
case one has to watch out for the order of the factors on the right side of
(47). The substitution $x_i \to t x_i$ multiplies the quadratic form (46) with
$t^2 = |t|^2 = Nt$, while $\xi_i \to \tau\xi_i$ changes (47) into $\tau f \bar\tau$, or since f is real, into


$N\tau \cdot f.$


The diagonal coefficients $\gamma_{ii}$ are real while the skew coefficients $\gamma_{ij}$ on one
side of the diagonal, i < j, may be chosen arbitrarily and then determine the
coefficients $\gamma_{ji}$ on the other side by (48). Hence the quadratic, Hermitian and
Hamiltonian forms f constitute linear spaces of


$N = \tfrac12 n(n+1), \quad n^2, \quad \text{or} \quad n(2n-1)$


dimensions respectively. The form f is said to be positive if $f(\mathfrak{x}) > 0$ except for
$\mathfrak{x} = 0$. According to our remarks above, $f^{1/2}$ may then serve as gauge function
in the real, imaginary or quaternion vector spaces.

Jacobi's transformation is the uniquely determined linear transformation of
recursive character of a positive quadratic form into a square sum. It is nothing else than the method of "completing the square" which, probably some
4000 years ago, was invented for the solution of quadratic equations. It no 
less applies to Hermitian and Hamiltonian forms, though in the latter case 
we have to bear in mind that there are no determinants. Thus we had better 
disregard this formal tool altogether. The discriminant of the form will be 
defined by recursion in the course of our construction. Its general explicit 
expression in terms of the coefficients y;; is a task about which we need not 
bother here [13]. I now give the description of the process for positive 
Hamiltonian forms f. 

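If f is positive, then $\gamma_{11}$ is real and greater than 0, $\gamma_{11} = q_1$. We form


$\zeta_1 = \xi_1 + \xi_2\,\gamma_{21}/\gamma_{11} + \cdots + \xi_n\,\gamma_{n1}/\gamma_{11}$
which implies


$\bar\zeta_1 = \bar\xi_1 + (\gamma_{12}/\gamma_{11})\,\bar\xi_2 + \cdots + (\gamma_{1n}/\gamma_{11})\,\bar\xi_n$


and find


(49) $f(\xi_1, \dots, \xi_n) = q_1|\zeta_1|^2 + f^*(\xi_2, \dots, \xi_n)$


where the remainder $f^*$ depends on the variables $\xi_2, \dots, \xi_n$ only. Incidentally
its coefficients are given by


(50) $\gamma_{ij}^* = \gamma_{ij} - \gamma_{i1}\gamma_{1j}/\gamma_{11}$ (i, j = 2, ..., n).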


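$f^*$ is positive; for if $\xi_2, \dots, \xi_n$ are any given values we may determine $\xi_1$ by
the equation


$\zeta_1 = \xi_1 + \xi_2\,\gamma_{21}/\gamma_{11} + \cdots + \xi_n\,\gamma_{n1}/\gamma_{11} = 0$
and then

$f^*(\xi_2, \dots, \xi_n) = f(\xi_1, \dots, \xi_n) > 0$

except for $\xi_2 = \cdots = \xi_n = 0$. Iteration of the splitting (49), therefore, leads to
an expression


(51) $f = q_1|\zeta_1|^2 + \cdots + q_n|\zeta_n|^2$


(Jacobi's transform) where the q are positive numbers and $\zeta_i$ linear forms of
the recursive type


(52) $\zeta_i = \xi_i + \sum_{j>i} \xi_j\,\beta_{ji}.$


The product $q_1 \cdots q_n = D = D_n$ is called the discriminant of f.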
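Break the sum (51) into two parts according to


$f = (q_1|\zeta_1|^2 + \cdots + q_{k-1}|\zeta_{k-1}|^2) + (q_k|\zeta_k|^2 + \cdots + q_n|\zeta_n|^2)$


and substitute $\mathfrak{x} = \mathfrak{e}_k$. The value of the whole form is $\gamma_{kk}$ while the value of the
second summand is $q_k$. Hence


(53) $q_k \le \gamma_{kk},$
(54) $D \le \gamma_{11} \cdots \gamma_{nn}.$
The Jacobi transformation of the positive form
$f^{(k)}(\xi_1, \dots, \xi_k) = f(\xi_1, \dots, \xi_k, 0, \dots, 0)$


of k variables is obtained from (51) by setting $\xi_{k+1} = \cdots = \xi_n = 0$. Consequently its discriminant is $D_k = q_1 \cdots q_k$ and thus


$q_k = D_k/D_{k-1} \quad (D_0 = 1).$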
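For real quadratic forms the whole process is a few lines of code. The following Python sketch is a minimal illustration of "completing the square" step by step; it assumes the real-case convention $z_i = x_i + \sum_{j>i} b_{ij} x_j$ with $b_{ij}$ stored at `b[i][j]`, and the function names are ours. It returns the numbers $q_i$ and the coefficients $b_{ij}$, from which the value of the form, the discriminant $D = q_1 \cdots q_n$ and the inequalities $q_k \le g_{kk}$ can be checked.

```python
from fractions import Fraction

def jacobi_transform(g):
    # g: symmetric matrix of a positive quadratic form, as a list of rows.
    n = len(g)
    g = [[Fraction(x) for x in row] for row in g]   # exact working copy
    q = []
    b = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        if g[i][i] <= 0:
            raise ValueError("form is not positive")
        q.append(g[i][i])
        for j in range(i + 1, n):
            b[i][j] = g[i][j] / g[i][i]
        # Coefficients of the remainder form f*: g*_jk = g_jk - g_ji * g_ik / g_ii.
        for j in range(i + 1, n):
            for k in range(i + 1, n):
                g[j][k] -= g[j][i] * g[i][k] / g[i][i]
    return q, b

def jacobi_value(q, b, x):
    # Evaluate sum_i q_i * z_i^2 with z_i = x_i + sum_{j>i} b_ij * x_j.
    n = len(x)
    z = [x[i] + sum(b[i][j] * x[j] for j in range(i + 1, n)) for i in range(n)]
    return sum(qi * zi * zi for qi, zi in zip(q, z))

form = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]
q, b = jacobi_transform(form)
```

For this sample form the product of the $q_i$ equals the determinant 4 of the coefficient matrix.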


The first step (49) goes through under the sole assumption $\gamma_{11} = q_1 > 0$. If,
in carrying the process further for a given form f, we find $q_2 > 0, \dots, q_n > 0$
at the following steps, then the formula (51) itself reveals that f is positive.
By (50) the inequality $q_2 > 0$ amounts to


$|\gamma_{12}|^2 = |\gamma_{21}|^2 < \gamma_{11} \cdot \gamma_{22}.$


More generally we must have


$|\gamma_{ij}|^2 < \gamma_{ii}\,\gamma_{jj} \quad (i \ne j)$


for any positive form f.
Next we compute the volume V of the 4n-dimensional ellipsoid $f(\mathfrak{x}) < 1$.
Denote by $\omega_n$ the volume of the sphere


$x_1^2 + \cdots + x_n^2 < 1$


in the n-dimensional real vector space. When in the recursive substitution
(52) we replace each of the quaternions $\xi$ and $\zeta$ by its 4 real x-components,
we again obtain a recursive substitution, this time in 4n variables, whose coefficient matrix has 1's along the principal diagonal and hence is of determinant 1. Thus the volume V is the same as that of the Jacobi transform
$\sum_i q_i|\zeta_i|^2 < 1$ or in real x-components


$\sum_i q_i\,(x_{i0}^2 + x_{i1}^2 + x_{i2}^2 + x_{i3}^2) < 1.$


Consequently


(55) $V = \omega_{4n}/(q_1 \cdots q_n)^2 = \pi^{2n}/((2n)!\,D^2).$


In the real and the imaginary case one finds


(56) $V = \omega_n/D^{1/2}, \quad V = \omega_{2n}/D = \pi^n/(n!\,D)$


instead. Incidentally these formulas prove that, although our recursive definition refers to a definite arrangement, the discriminant of f is not changed by
arranging the variables $\xi_1, \dots, \xi_n$ in a different order.
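The three volume formulas follow one pattern: with $\kappa = 1, 2, 4$ real components per variable, the ellipsoid $\sum_i q_i|\zeta_i|^2 < 1$ has volume $\omega_{\kappa n}/D^{\kappa/2}$. The Python sketch below is an illustration under that reading; the function names and the string keys are ours. It computes $\omega_m$ from the Gamma function, $\omega_m = \pi^{m/2}/\Gamma(m/2 + 1)$, and evaluates the volume in each case.

```python
import math

def ball_volume(m):
    # Volume of the unit ball x_1^2 + ... + x_m^2 < 1 in m real dimensions.
    return math.pi ** (m / 2) / math.gamma(m / 2 + 1)

def ellipsoid_volume(kind, qs):
    # kind selects kappa = 1, 2, 4 real components per variable;
    # qs are the positive numbers q_1, ..., q_n of the Jacobi transform.
    kappa = {"real": 1, "imaginary": 2, "quaternion": 4}[kind]
    n = len(qs)
    discriminant = math.prod(qs)
    return ball_volume(kappa * n) / discriminant ** (kappa / 2)
```

For instance, in the quaternion case with n = 2 and $q_1 = q_2 = 1$ this gives $\omega_8 = \pi^4/4!$.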

From here on we limit ourselves to real quadratic forms, because the ad- 
justments to the two other cases are sufficiently trivial; only an occasional 
glance will be cast upon them. 

9. Some simple topological considerations. Within the N-dimensional linear space R of all quadratic forms $f = \{g_{ij}\}$ the positive ones form a convex
subset G which is a cone with the origin f = 0 as vertex. The relative clause
means that dilatation, $f \to tf$, at any positive rate t carries G into itself. G is
an open set. Indeed the quantities emerging at the first step of Jacobi's transformation,


$q_1 = g_{11}, \quad b_{1j} = g_{1j}/g_{11}, \quad g_{ij}^* = g_{ij} - g_{i1}g_{1j}/g_{11}$


(i, j = 2, ..., n),


all depend continuously on f. [We now use corresponding Roman instead of
Greek letters throughout, so that the transformation (52) reads


(52') $z_i = x_i + \sum_{j>i} b_{ij}\,x_j.$]


Hence $q_1, \dots, q_n$, $b_{ij}$ (i < j) depend continuously on f at a given point $f^0$ of G,
and all forms f in a certain neighborhood U of $f^0$ will satisfy the conditions
$q_1 \ge \tfrac12 q_1^0, \dots, q_n \ge \tfrac12 q_n^0$


and thus be positive.
Jacobi's transformation shows quite explicitly that for a given positive
form f and a given number A the inequality $f(\mathfrak{x}) < A$ entails upper bounds
for the $|x_i|$ of $\mathfrak{x} = (x_1, \dots, x_n)$. In fact, one first obtains upper bounds for
$|z_1|, \dots, |z_n|$ and then, going in backward direction, from the relations
(52') upper bounds for $|x_n|, |x_{n-1}|, \dots, |x_1|$. One can make this estimate
uniform throughout a sufficiently small neighborhood of a given form. Hence
this

LEMMA 4. Let A(f) be a real function depending on a variable point f in G
and continuous at the given point $f^0$. We can fix a neighborhood U of $f^0$ such that
nearly every lattice vector $\mathfrak{x} = (x_1, \dots, x_n)$ has the property of satisfying the inequality

$f(\mathfrak{x}) > A(f)$

for all f in U.


(“Nearly every” means that only a finite number lack the property in 
question.) 
Proof. We fix the neighborhood U so that 


If $\mathfrak{x}$ is a vector such that there is an f in U for which $f(\mathfrak{x}) \le A(f)$, then (51)
yields upper bounds for $|z_i|$ which are universal in that they do not depend
on the specific f in U, and (52') yields universal bounds for $|x_n|, \dots, |x_1|$.

From now on up to the end of §12, f without or with accent or index al- 
ways indicates a point of G. All topological notions are to be interpreted rela- 
tive to G; e.g., a subset of G is said to be open or closed whenever it is open 


or closed relative to G. 
Before going on we specialize some of our previous definitions concerning 


gauge functions to gauge functions of the type f'/? now under consideration. 
A positive quadratic form f is said to be reduced if it satisfies the inequality


$f(x_1, \dots, x_n) \ge g_{kk}$
for any vector $(x_1, \dots, x_n)$ in $X_k$ and for k = 1, ..., n. This implies
$(0 <)\ g_{11} \le g_{22} \le \cdots \le g_{nn}.$


Two forms f, f' are called equivalent and counted in the same class if one
proceeds from the other by a substitution


$x_i = \sum_k s_i^k x_k'$


of the modular group. Every point f in G is equivalent to a reduced one.
To each index k and vector $\mathfrak{x} = (x_1, \dots, x_n)$ in $X_k$ there corresponds a
linear form of the coordinates $g_{ij}$ in R,


$f(\mathfrak{x}) - g_{kk} = \sum_{i,j} g_{ij}\,x_i x_j - g_{kk} = \sum_{i,j} a_{ij}\,g_{ij}$
which we denote by $\alpha_k(\mathfrak{x})$; its coefficients are
$a_{ij} = x_i x_j - \delta_{ik}\delta_{jk}.$
The relations for the variable point $\{g_{ij}\}$,
$\sum_{i,j} a_{ij}\,g_{ij} = 0, \quad \ge 0, \quad > 0$


are referred to as the equation, the inequality and the strict inequality $\alpha_k(\mathfrak{x})$
respectively. Except for $\mathfrak{x} = \pm\mathfrak{e}_k$, i.e., for every vector $\mathfrak{x}$ in $X_k^*$, the inequality
and equation $\alpha_k(\mathfrak{x})$ define a half-space and its bounding (N-1)-dimensional
plane in R. Now f is properly reduced provided the strict inequality $\alpha_k(\mathfrak{x})$
is satisfied for every $\mathfrak{x}$ in $X_k^*$ and every k. Examples of properly reduced forms
are ready at hand; the simplest are the diagonal forms


$g_1 x_1^2 + \cdots + g_n x_n^2$ with $0 < g_1 < g_2 < \cdots < g_n$.
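The defining inequalities can be tested on a finite box of lattice vectors. The Python sketch below is an illustration only: it assumes that, in the real case, $X_k$ consists of the integral vectors whose last coordinates $x_k, \dots, x_n$ have greatest common divisor 1; the function names are ours; and a finite search can exhibit violations but never proves reducedness. It lists the pairs $(k, \mathfrak{x})$ with $\mathfrak{x} \ne \pm\mathfrak{e}_k$ for which the strict inequality $f(\mathfrak{x}) > g_{kk}$ fails.

```python
from functools import reduce
from itertools import product
from math import gcd

def in_X(x, k):
    # k is 1-based; membership means gcd(x_k, ..., x_n) = 1 (assumed real-case definition).
    return reduce(gcd, (abs(v) for v in x[k - 1:]), 0) == 1

def form_value(g, x):
    n = len(x)
    return sum(g[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def strict_violations(g, bound=3):
    # Search the box |x_i| <= bound for vectors of X_k, other than +-e_k,
    # at which the strict inequality f(x) > g_kk fails.
    n = len(g)
    bad = []
    for x in product(range(-bound, bound + 1), repeat=n):
        for k in range(1, n + 1):
            if not in_X(x, k):
                continue
            e_k = tuple(1 if i == k - 1 else 0 for i in range(n))
            if x == e_k or x == tuple(-v for v in e_k):
                continue
            if form_value(g, x) <= g[k - 1][k - 1]:
                bad.append((k, x))
    return bad

diagonal_increasing = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
diagonal_wrong_order = [[2, 0], [0, 1]]
```

The increasing diagonal form shows no violation in the box, while the form with $g_{11} > g_{22}$ fails already at k = 1 for the vector (0, 1).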


The reduced points form a closed convex subset Z of G which again is a 
cone and will be called the (basic) cell. A properly reduced f is said to belong 
to the core of Z. An inner point of Z belongs to its core. Each unimodular 
substitution S carries Z into an equivalent cell Zs. The substitutions of the 
subgroup {J} leave Z unchanged, but if S is not in {J} then no point of 
the core of Z can be in Zs (Theorem 7). Hence the equivalent cells Zs cover G 
without gaps and overlappings; two different cells have none but boundary 
points in common. Here two substitutions like S and JS which are left equiva- 
lent modulo {J} are to be identified because they have the same effect on Z. 


Our aim is first to study the individual cell Z and then the whole pattern of
the division of G into equivalent cells.
We start with the observation that a point $f^0$ belonging to the core of Z is
an inner point of Z. Indeed according to Lemma 4, nearly every lattice vector
$\mathfrak{x}$ satisfies the inequalities $f(\mathfrak{x}) > g_{kk}$ for k = 1, ..., n and for all forms f in a certain neighborhood U of $f^0$. Therefore among the infinitely many inequalities


(57) $\alpha_k(\mathfrak{x})$ ($\mathfrak{x}$ in $X_k^*$, k = 1, ..., n)


there are only a finite number, say $\alpha', \alpha'', \dots$, which are not a priori sure
to hold throughout U. But if the strict inequalities $\alpha', \alpha'', \dots$ hold for $f^0$
then they hold also in a sufficiently small neighborhood U' of $f^0$; and the
neighborhood $U \cap U'$ of $f^0$ lies in Z.

Denote by $T_k$ the subset of $X_k$ to which $\mathfrak{x}$ belongs if there are reduced
forms f satisfying the equation $f(\mathfrak{x}) = g_{kk}$. The two vectors $\pm\mathfrak{e}_k$ belong to $T_k$,
and again $T_k^*$ designates what is left of $T_k$ after these two vectors have been
removed. The planes $\alpha_k(\mathfrak{x}) = 0$ corresponding to the $\mathfrak{x}$ in $T_k^*$ graze the cell Z.
Our last result asserts that every boundary point of Z lies in one of these
grazing planes


(58) $\alpha_k(\mathfrak{x}) = 0$ ($\mathfrak{x}$ in $T_k^*$, k = 1, ..., n).


Hence from a general topological principle which we shall presently prove for 
our special situation there follows 


THEOREM 10. In the definition of Z as the set consisting of all points f of G
which satisfy the inequalities


(59) $\alpha_k(\mathfrak{x}) \ge 0$ for every $\mathfrak{x}$ in $X_k$ and k = 1, ..., n,
the vector set $X_k$ may be replaced by $T_k$.


Proof. Choose one of the points $f^0$ belonging to the core of Z as the center
of Z and suppose f is any point (of G) outside Z. Join $f^0$ with f by a straight
segment. Somewhere, at a point f', it will cross the border of Z; the part $f^0 f'$
of the segment, including f', belongs to Z while the points beyond f' are outside Z. The point f' satisfies one of the equations (58), say


(60) $\alpha_k(\mathfrak{x}) = 0.$


The left member of (60) is greater than 0 at $f^0$, equals 0 at f', and hence is
less than 0 at f. Consequently a point f which satisfies all inequalities


$\alpha_k(\mathfrak{x}) \ge 0$ ($\mathfrak{x}$ in $T_k^*$, k = 1, ..., n)
cannot lie outside Z [14].
We denote by $X_k^0$ the set of lattice vectors $(x_1, \dots, x_n)$ for which
$x_k = 1, \quad x_{k+1} = \cdots = x_n = 0.$


$X_k^0$ is a subset of $X_k$. Let p be any number greater than 1 and $\sigma$ a positive
number. Later on we shall have occasion to study the part $G(p, \sigma)$ of G defined
by the following simultaneous inequalities:


(61$_1$) $f(x_1, \dots, x_n) \ge \frac{1}{p}\,g_{kk}$ for every vector $(x_1, \dots, x_n)$ in $X_k$,


(61$_2$) $f(x_1, \dots, x_n) \ge g_{kk} - \sigma g_{11}$ for every vector $(x_1, \dots, x_n)$ in $X_k^0$
[k = 1, ..., n].


$G(p, \sigma)$ is a closed convex part of G which increases with increasing p and $\sigma$.
A point f of G satisfying all these inequalities (61) with the > sign is an inner
point of $G(p, \sigma)$, as follows by the argument previously applied to Z. The
domain $G(p, \sigma)$ contains the cell Z in its interior. I propose to show that with
$p \uparrow \infty$, $\sigma \uparrow \infty$ it exhausts the whole G. Let f be any point of G. All lattice vectors $(x_1, \dots, x_n)$ except those of a certain finite set $\Sigma$ satisfy the inequalities


$f(x_1, \dots, x_n) > g_{kk}$ (k = 1, ..., n)


and hence (61), whatever the values p > 1 and $\sigma > 0$. When $(x_1, \dots, x_n)$ varies
over the finite set $X_k \cap \Sigma$, $f(x_1, \dots, x_n)$ will assume a least (positive) value
$g_{kk}/p_k$. Thus all the inequalities (61$_1$), with the > sign and for k = 1, ..., n,
will hold as soon as $p > 1, p_1, p_2, \dots, p_n$. In the same manner one sees that,
for a sufficiently high $\sigma$, f satisfies all relations (61$_2$) with the > sign for
k = 1, ..., n.
10. The first theorem of finiteness. We now resume the algebraic study
of reduced forms, first specializing Theorem 5 for the gauge function $f^{1/2}$:


THEOREM 11. Any reduced form $f = \{g_{ij}\}$ satisfies the inequality
(62) $\lambda_n\,g_{11} \cdots g_{nn} \le D$
where $\lambda_n = (\omega_n/\mu_n)^2$.


About the constant $\mu_n$ see the Supplement to Theorems 4-6 in §3. We
use the formulas (55) and (56) for the volumes of our ellipsoidal gauge bodies
and thus obtain a corresponding inequality


$\lambda_n^*\,\gamma_{11} \cdots \gamma_{nn} \le D$
for reduced Hermitian and Hamiltonian forms, with
i 3” 1 


certainly not but good. 
In passing we mention the following relations:


(63) $2|g_{ij}| \le g_{ii}, \quad |\gamma_{ij}| \le r\,\gamma_{ii},$


which hold for reduced forms and for i < j. Choose two different indices, say 2
and 5. The two vectors for which $x_2 = \pm 1$, $x_5 = 1$ and all other $x_i$ vanish belong to $X_5$; hence


$g_{22} \pm 2g_{25} + g_{55} \ge g_{55}$
or
$2|g_{25}| \le g_{22}.$


In the imaginary and quaternion cases the procedure is as follows. Let $\nu$ range
over all integers. We take $\xi_5 = 1$ and $\xi_2 = -\nu$ while all other $\xi_i$ vanish. The
resulting inequality reads


$N\nu \cdot \gamma_{22} - \nu\gamma_{25} - \gamma_{52}\bar\nu \ge 0$
which for $\eta = \gamma_{52}/\gamma_{22}$ yields

$N(\eta - \nu) \ge N\eta.$

This means that, in the lattice of integers, $\eta$ is not farther from zero than from
any other integer. Hence this distance $|\eta|$ cannot exceed r.


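If $f(x_1, \dots, x_n)$ is a reduced form of n variables, then
$f^{(k)} = f(x_1, \dots, x_k, 0, \dots, 0)$
is one of k variables, therefore
$\lambda_k\,g_{11} \cdots g_{kk} \le D_k.$
Combining this with (54) for $f^{(k-1)}$, $D_{k-1} \le g_{11} \cdots g_{k-1,k-1}$, we find the important
inequality
(64) $q_k \ge \lambda_k\,g_{kk}$ (k = 1, ..., n)


holding for reduced forms f.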
We are now sufficiently prepared to prove the first theorem of finiteness: 


THEOREM 12. The set $T_k$ of lattice vectors is finite.


Hence by Theorem 10 we have succeeded in sifting from the infinitely
many inequalities (59) a finite number on which all others are consequent and
therefore redundant. In proving our proposition we shall give fairly explicit
upper bounds of $|x_1|, \dots, |x_n|$ for the vectors $\mathfrak{x} = (x_1, x_2, \dots, x_n)$ in $T_k$.


Proof. Suppose $\mathfrak{x}$ is in $T_k$ and f a reduced form for which $f(\mathfrak{x}) = g_{kk}$. In
particular $\mathfrak{x} = \mathfrak{e}_k$ fulfills this demand. We apply Jacobi's transformation to f
and then find for the vector in question


$\sum_i q_i z_i^2 = g_{kk};$


a fortiori


$\sum_{j \ge k} q_j z_j^2 \le g_{kk}.$


In the last sum $q_j \ge \lambda_j g_{jj} \ge \lambda_j g_{kk}$ ($j \ge k$), and thus the inequality


$\sum_{j \ge k} \lambda_j z_j^2 \le 1$

results which yields universal upper bounds for $|z_k|, \dots, |z_n|$.

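To find universal bounds for $|z_1|, \dots, |z_{k-1}|$ is a slightly more intricate
job. Let h be a given index less than k. Without altering $x_n, \dots, x_{h+1}$ we may
replace $x_h, \dots, x_1$ by such integers $x_h^*, \dots, x_1^*$ in succession that the corresponding $z_h^*, \dots, z_1^*$ satisfy

$|z_h^*| \le \tfrac12, \dots, |z_1^*| \le \tfrac12.$
Since the new vector $(x_1^*, \dots, x_h^*, x_{h+1}, \dots, x_n)$ also is in $X_k$ we must have
$f(x_1^*, \dots, x_h^*, x_{h+1}, \dots, x_n) \ge g_{kk},$

consequently


$(q_1 z_1^{*2} + \cdots + q_h z_h^{*2}) + (q_{h+1} z_{h+1}^2 + \cdots + q_n z_n^2)$
$\quad \ge g_{kk} = (q_1 z_1^2 + \cdots + q_h z_h^2) + (q_{h+1} z_{h+1}^2 + \cdots + q_n z_n^2)$
or
$q_1 z_1^{*2} + \cdots + q_h z_h^{*2} \ge q_1 z_1^2 + \cdots + q_h z_h^2.$
The left member is less than or equal to
$r^2(q_1 + \cdots + q_h) \le r^2(g_{11} + \cdots + g_{hh}) \le r^2 h\,g_{hh} \quad (r = \tfrac12).$
Hence
$r^2 h\,g_{hh} \ge q_h z_h^2 \ge \lambda_h g_{hh} z_h^2$ or


(65) $z_h^2 \le r^2 h/\lambda_h$ (h = 1, ..., k-1).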


(The notation r is used in order to cover also the imaginary and quaternion
cases.)
Applying (65) to $\mathfrak{x} = \mathfrak{e}_k$ one gets


(66) $b_{hk}^2 \le r^2 h/\lambda_h$ (for h < k).


The universal upper bounds for $|z_n|, \dots, |z_1|$ together with the universal
bounds for the moduli of the coefficients $b_{ij}$ in the recursive equations (52')
result in universal upper bounds for $|x_n|, \dots, |x_1|$.




This argument is chiefly due to Minkowski and is in my view the backbone of his theory of reduction. The simple remark leading from (65) to (66)
was first made by Remak [15]. It dispenses with the necessity of making use
of the explicit expression of $b_{hk}$, as Minkowski did, which is the more fortunate
as it would have been quite cumbersome to follow his procedure in the quaternion case.

11. The second theorem of finiteness. Generators of the modular group. 
We prove now the following theorem. 


THEOREM 13. The set $G(p, \sigma)$ has points in common with not more than a
finite number of cells $Z_S$.


We must show that there is only a finite number of unimodular substitutions S capable of carrying an (unspecified) point f of Z into a point f' of
$G(p, \sigma)$,


$f(y_1\mathfrak{s}_1 + \cdots + y_n\mathfrak{s}_n) = f'(y_1, \dots, y_n).$


Here $(\mathfrak{s}_1, \dots, \mathfrak{s}_n)$ is a lattice basis of the property B(p, ..., p) with respect
to $f^{1/2}$. Consider the two series of subspaces
$E_0, \quad E_1 = [\mathfrak{e}_1], \quad E_2 = [\mathfrak{e}_1, \mathfrak{e}_2], \dots, E_n;$


$E_0', \quad E_1' = [\mathfrak{s}_1], \quad E_2' = [\mathfrak{s}_1, \mathfrak{s}_2], \dots, E_n'.$


$E_0 = E_0'$ is the zero space, $E_n = E_n'$ the full vector space. Let l be the highest
of the indices 1, ..., n for which


(67) $E_{l-1} = E_{l-1}'.$


The decision whether or not $E_k = E_k'$ depends merely on checking whether
some integers are zero. For l there exist the possibilities l = 1, ..., n. We
propose to consider the S with a definite l.

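First we focus our attention on the vectors


(68) $\mathfrak{s}_k$ (k = l, ..., n).
For the moment let $\mathfrak{x} = (x_1, \dots, x_n)$ denote the vector $\mathfrak{s}_k$; then
(69) $f(\mathfrak{x}) = q_1 z_1^2 + \cdots + q_n z_n^2 = g_{kk}'.$
Put
$\theta_i(p) = \theta_i'$ for $p_1 = \cdots = p_n = p$.


Because of the significance of l and Theorem 9$_p$, we have


$g_{i+1,i+1} \le (\theta_i'\,\theta_{i+1})^2\,g_{ii}$


for $i \ge l$. Therefore and because f is reduced,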


2 2 2 2 


(70) = gkk (0-101)? 


LY 2 
+ 


while by Theorem 8, 


Combining the two inequalities (70) and (71) with (69), we get hold of universal upper bounds for $|z_l|, \dots, |z_n|$, namely


So far we have used merely the first set (61$_1$) of inequalities for f'.
The second set yields universal bounds for $|z_1|, \dots, |z_{l-1}|$. Suppose
$y_1, \dots, y_{l-1}$ to be any integers; we have


which is equivalent to 
or 
(72) $f(x_1^*, \dots, x_{l-1}^*, x_l, \dots, x_n) \ge f(x_1, \dots, x_n) - \sigma g_{11}'$


where $(x_1, \dots, x_n)$ again is the vector $\mathfrak{s}_k$, and $x_1^*, \dots, x_{l-1}^*$ denote any integers. In fact
= chi = trait rin 


with 


Observe that $\mathfrak{s}_1, \dots, \mathfrak{s}_{l-1}$ span the lattice in $E_{l-1}$ so that $x_1', \dots, x_{l-1}'$ and
therefore $x_1^*, \dots, x_{l-1}^*$ range independently over all integers while $y_1, \dots, y_{l-1}$
do so. Let h be one of the indices 1, ..., l-1, and choose $x_i^* = x_i$ for i > h,
but $x_h^*, \dots, x_1^*$ such that


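$|z_h^*| \le r, \dots, |z_1^*| \le r \quad (r = \tfrac12).$


Then (72) yields


$q_1 z_1^{*2} + \cdots + q_h z_h^{*2} \ge q_1 z_1^2 + \cdots + q_h z_h^2 - \sigma g_{11}'.$


The left member is less than or equal to
$r^2(g_{11} + \cdots + g_{hh})$, and $g_{11}' \le p\,g_{11}$ by (61$_1$), so that $q_h z_h^2 \le (\sigma p + r^2 h)\,g_{hh}$;
thus, because $q_h \ge \lambda_h g_{hh}$,
$\lambda_h z_h^2 \le \sigma p + r^2 h$ (h = 1, ..., l-1).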

Hence we have obtained universal bounds for all $|z_i|$ and by means of (66)
also for all $|x_i|$. In other words, for each of the lattice vectors (68) we find ourselves limited to a finite set from which to choose.

If l = 1 nothing remains to be said. In the opposite case the same situation
prevails for the "cut" forms


$f(x_1, \dots, x_{l-1}, 0, \dots, 0), \quad f'(x_1, \dots, x_{l-1}, 0, \dots, 0)$


of l-1 < n variables in $E_{l-1}$ as for the full forms f and f' in n dimensions
which we started with. Thus the proof is complete by induction.

The main idea of the proof is again borrowed from Minkowski, with two
essential modifications:

(1) Where Minkowski uses estimates based upon Jacobi’s transformation 
of quadratic forms, we have availed ourselves of the general Theorems 8 and 9 
holding for any gauge function whatsoever; in spite of their far greater gen- 
erality these estimates are sharper than Minkowski’s. 

(2) Minkowski has our proposition only for


$p = 1, \quad \sigma = 0, \quad G(p, \sigma) = Z,$


in which case it asserts that Z borders on not more than a finite number of 
equivalent cells Zs. However, we should know that every boundary point of Z 
is on the common boundary of Z and a different cell Zs, or that the cells Zs 
cluster only towards the border of G, which means that into any sufficiently 
small neighborhood of a point of G, or into any compact subset of G, there 
penetrate only a finite number of cells $Z_S$. Our theorem goes beyond this because $G(p, \sigma)$ exhausts G if $p \uparrow \infty$, $\sigma \uparrow \infty$, but is not compact. About this finer
point refer to §13. Here is an application of the fact that the cells do not
cluster in the interior of G:


Lemma 5. Any cell Z'=Zs may be reached from the basic cell Z by a chain 
(73) 


in which any two consecutive members are in contact, i.e., have points in common. 




Proof. The center $f^0$ of $Z$ goes by the substitution $S$ into an inner point $f_S^0$
of $Z_S = Z'$. Join $f^0$ with $f_S^0$ by a straight segment $r$. Determine $p > 1$ and
$\sigma > 0$ so that $f_S^0$ is an inner point of $G(p, \sigma)$. Then the whole segment $r$ lies
in $G(p, \sigma)$. Since the number of cells $Z_S$ having points in common with
$G(p, \sigma)$ is finite, the same is true a fortiori for the cells $Z_S$ which are met by
the segment $r$. On the other hand every point of $r$ belongs to a certain cell
$Z_S$, and the points which $r$ and $Z_S$ have in common form a (closed) interval
on $r$. Hence $r$ is covered by a finite number of subintervals of which we can
select a chain connecting $f^0$ with $f_S^0$. What we obtain in this manner is a chain
of cells (73) in which any two consecutive members have a contact point on $r$.

Those substitutions of the modular group which effect transition from $Z$
to cells in contact with $Z$ form a finite set $[Z]$. If $S$ is in $[Z]$, so is the
“(two-sided) congruent” substitution


$S^* = JSJ'$ ($J$, $J'$ any two elements of $\{J\}$)


as one readily verifies by performing the substitution $J'$ on the two contacting
cells $Z$ and $Z_{JS} = Z_S$. Hence $[Z]$ breaks up into a number of complete sets
of congruent substitutions; we choose a representative out of each set:
$S', S'', \dots$.


THEOREM 14. The substitutions of $\{S\}$ which carry $Z$ into cells bordering
on $Z$, or rather a complete system of modulo $\{J\}$ incongruent representatives
$S', S'', \dots$ among them, combined with $\{J\}$, generate the whole modular group
$\{S\}$.


Proof. Let S be any element of the modular group and determine a chain 
(73) leading from Z to Zgs=Z’. A certain unimodular S;' will carry Z; into 
Z and Z;4: into a cell contacting Z which therefore arises from Z by an 
element S“ of [Z]. The substitution SS, carries Z into Z;,: and thus can 
and shall be adopted as S;4:. If this inductive definition of S; is started off 
with S, the identity, then S°-» - - - S® carries Z into Zs, and therefore 


S = JSO-) ... (J in {7}). 


12. Faces and walls. The main body of the theory of reduction is now
complete; what follows are accessories of minor importance. In this section
we discuss the consequences upon the cell configuration of the fact that any
boundary point of a convex solid polyhedron lies on one of its faces. Engaging
in this kind of general topological argument, we prefer the notation
$y_1, \dots, y_N$ instead of $g_{ij}$ for the coordinates in our $N$-dimensional space $R$.
A face of the cell would be described by one of the equations


(57) ax(r) (r in k= 1, n), 


which hold for $N-1$ linearly independent points of $Z$. Taking it for granted
that each boundary point of $Z$ lies on a face, we infer from the proof of
Theorem 10 that the corresponding inequalities suffice to define $Z$ as a part
of $G$: those planes (57) which do not share an $(N-1)$-dimensional convex
face with $Z$ may be discarded. It is clear that on account of their “extreme”
character the remaining inequalities are truly indispensable.

As to the configuration of all equivalent cells $Z_S$, it seems clear that any
point on the boundary of $Z$ lies on a “wall” separating $Z$ from an “adjacent”
cell $Z_S$. By these words “wall” and “adjacent” we wish to indicate that $Z$
and $Z_S$ have $N-1$ linearly independent points in common. The points which
two cells have in common, if any, form a convex cone of 1 or 2 or $\dots$ or
$N-1$ dimensions. We speak of a contact of order $1, 2, \dots, N-1$ respectively.
The unimodular $S$ carrying $Z$ into adjacent cells form a finite set
$[[Z]]$ narrower than $[Z]$. Again it decomposes into subsets of congruent
substitutions. Theorem 14 remains true if $S', S'', \dots$ denote representatives of
these sets. We can dispense with none of these more restricted generators.

The ultimate goal of all such considerations should be to show that the 
pattern of our cells which mutually border on each other is a complex in the 
combinatorial topological sense, of such particular structure as to form the 
skeleton of a manifold. 

It is clear that the walls of Z are parts of its faces. This simple observation 
establishes a close relationship between the first and Minkowski’s special case 
of the second theorem of finiteness. 

I shall try to give the most convenient arrangement of the proofs. First
the faces of $Z$.


LEMMA 6. Any boundary point of Z lies on a face of Z. 


We know that $Z$ as a part of $G$ is characterized by inequalities


(74) $a(y) \ge 0$


corresponding to a finite set $\Sigma = \Sigma_0$ of linear forms $a(y)$. Let $f^1$ be a point
(of $G$) on the boundary of $Z$; it will satisfy at least one of the inequalities of $\Sigma$
with the = sign. After an appropriate linear transformation of the coordinates
$y_i$ we may assume


(75) $f^1 = e^1 = (1, 0, 0, \dots, 0)$.


$\Sigma_1$ is the non-empty subset of $\Sigma$ to which a linear form $a(y)$ belongs if nullified
by $e^1$. Their first coefficient $a_1$ vanishes, so that they may be looked upon as
forms of $N-1$ variables. For the linear forms $a(y)$ in the complementary subset
$\Sigma_0'$ the first coefficient $a_1$ is positive. We describe the $\nu$th step of this process
of selection. Suppose the subset $\Sigma_\nu$ of those linear forms of $\Sigma$ in which the
variables $y_1, \dots, y_\nu$ are absent is not empty. The corresponding inequalities


$a_{\nu+1} y_{\nu+1} + \cdots + a_N y_N \ge 0$


of $\Sigma_\nu$ define a convex pyramid $Z^\nu$ in the $(N-\nu)$-dimensional space $R^\nu$ with
the coordinates $y_{\nu+1}, \dots, y_N$. As long as $N - \nu \ge 2$, we can find a point
$f^{\nu+1} \ne 0$ on the boundary of that pyramid, and by a suitable affine
transformation of the coordinates $y_{\nu+1}, \dots, y_N$ we can provide for $f^{\nu+1}$ having
the coordinates


$(y_{\nu+1}, \dots, y_N) = (1, 0, \dots, 0)$.


$\Sigma_\nu$ breaks up into the subsets $\Sigma_{\nu+1}$ and $\Sigma_\nu'$ whose members have their first
coefficient $a_{\nu+1} = 0$ and $> 0$ respectively. $\Sigma_{\nu+1}$ is not empty.

The existence of $f^{\nu+1}$ follows in this way. Denote by $f^0 = (y_1^0, \dots, y_N^0)$ the
center of the cell $Z$. All linear forms $a(y)$ belonging to $\Sigma_\nu$ have the property
$a(f^{0\nu}) > 0$ for


(76) $f^{0\nu} = (y_{\nu+1}^0, \dots, y_N^0)$,


or (76) is an inner point of $Z^\nu$. Operating in the $(N-\nu)$-dimensional space $R^\nu$
we choose one of the forms of $\Sigma_\nu$, say $a'(y)$, and a point $f \ne 0$ in the plane
$a'(y) = 0$. (As long as $R^\nu$ has at least two dimensions, a plane $a'(y) = 0$ through
the origin $O$ certainly contains points $f \ne 0$.) We join $f^{0\nu}$ with $f$ by a straight
segment, which will not contain the origin $O$. Traveling along the segment
from $f^{0\nu}$ to $f$ we encounter a first point $f^*$ where one of the forms of $\Sigma_\nu$ ceases
to be positive. (If not before this will happen for $f$.) All forms of $\Sigma_\nu$ are greater
than or equal to 0 for $f^*$ and at least one equals 0. We take $f^{\nu+1} = f^*$.

We end up with a non-empty set $\Sigma_{N-1}$ consisting of linear forms $a_N y_N$ in
the 1-dimensional space $R^{N-1}$ with the single coordinate $y_N$. They are positive
for $y_N = y_N^0$. We take one of them as the coordinate $y_N$; then the coefficients
$a_N$ of the others are greater than 0 and $y_N \ge 0$ is the pyramid $Z^{N-1}$ in $R^{N-1}$.
At the same time we have arrived at a complete normalization of the affine
system of coordinates $y_1, \dots, y_N$.

By construction the pyramid $Z^{\nu-1}$ in $R^{\nu-1}$ contains the point


$(y_\nu, \dots, y_N) = (1, 0, \dots, 0)$.


The system $\Sigma_{\nu-1}$ of linear forms


$a_\nu y_\nu + \cdots + a_N y_N$


splits into $\Sigma_\nu$ and $\Sigma_{\nu-1}'$ according to the condition $a_\nu = 0$ or $a_\nu > 0$. It is
therefore easy to ascertain a positive constant $\epsilon_\nu < 1$ such that
$(1, y_{\nu+1}, \dots, y_N)$ lies in $Z^{\nu-1}$ provided $(y_{\nu+1}, \dots, y_N)$ lies in $Z^\nu$ and

$|y_{\nu+1}| \le \epsilon_\nu, \dots, |y_N| \le \epsilon_\nu$.


This is true even at the first step $\nu = 1$ when $R^0 = R$ is restricted to $G$, because
for a sufficiently small $\epsilon$ the neighborhood of (75) described by


$y_1 = 1$, $|y_2| \le \epsilon, \dots, |y_N| \le \epsilon$
lies in $G$.
Starting with the point $y_N = 0$ in $Z^{N-1}$ and following this rule for the
transition $Z^\nu \to Z^{\nu-1}$ backwards from $Z^{N-1}$ to $Z$, we find that the following
$N-1$ points

$(1, 0, 0, 0, \dots, 0)$,
$(1, \epsilon_1, 0, 0, \dots, 0)$,
$(1, \epsilon_1, \epsilon_1\epsilon_2, 0, \dots, 0)$,
$\dots$

belong to $Z$. Thus the plane $y_N = 0$ belongs to $\Sigma_{N-1}$, hence to $\Sigma$, is a face and
contains the point $f^1$.


LEMMA 7. Any cell $Z' = Z_S$ may be reached from the basic cell $Z$ by a chain
whose consecutive members are adjacent.


The inner reason for this lemma is obvious: because the region G is con- 
vex, the cell complex into which it has been divided is connected. 

We start with the chain described in Lemma 5. Any two of its consecutive
members have a common point $f$ situated on the segment $r$; but in general
their contact will be one of order 1 only. We must insert further cells between
them to make the chain proceed by contacts of order $N-1$.

The point $f$, being common to two cells, is not an inner point of a cell.
I shall try to describe the situation intuitively in the plane section $g_{11} = 1$ of $G$.
The cells to which $f$ belongs cover an entire neighborhood $U$ of $f$, each of them
participating in it by an $(N-1)$-dimensional pyramid with vertex $f$. Hence
we obtain a division of the $(N-1)$-dimensional space $R^1$ into a finite number
of convex pyramids radiating from the vertex $f$, and our task is to prove that
this complex is connected. We thus face the same problem as before, but in
one dimension less, and hence induction with respect to $N$ will lead to the
desired result. Let us now repeat the argument in detail, again using the
notation $y_1, \dots, y_N$ instead of $g_{ij}$ for the coordinates in $R$.

Not more than a finite number of cells $Z_S$ penetrate into a neighborhood $U$
of $f$ which lies in $G(p, \sigma)$. If one of these cells does not contain $f$, then $U$ may
be shrunk so as to have its intersection with the closed $Z_S$ empty. Hence we
find a smaller neighborhood of $f$, again called $U$, into which none but cells $Z_S$
containing $f$ will penetrate. We choose the coordinates $y_i$ such that
$f = (1, 0, 0, \dots, 0)$. A cell $Z_S$ is defined by a finite set $\Sigma$ of inequalities (74)
which as before is divided into the subsets $\Sigma_1$ and $\Sigma_0'$, and as has been shown
above, any point $(1, y_2, \dots, y_N)$ sufficiently near to $f$, if it satisfies merely
the inequalities $\Sigma_1$, will lie in $Z_S$. The inequalities $\Sigma_1$ define a convex pyramid
$Z_S^1$ in the $(N-1)$-dimensional space $R^1$ with the coordinates $y_2, \dots, y_N$.
The center $(y_1^0, \dots, y_N^0)$ of $Z_S$ gives rise to a center $(y_2^0, \dots, y_N^0)$ of $Z_S^1$. Thus
the $Z_S$ determine a division of $R^1$ into a finite number of pyramids $Z_S^1$, and
our aim is to prove the connectivity of that assemblage. Let us formulate
this assertion as a lemma for $N$ instead of $N-1$ dimensions.




LEMMA $8_N$. Suppose the $N$-dimensional space $R$ divided into a finite number
of convex pyramids $\Pi$ with their common vertex at the origin $O$. Each of them
is supposed to contain inner points. Then any two of them can be joined by a
chain whose consecutive members have contacts of order $N-1$.


The argument employed to reduce Lemma 7 to $8_{N-1}$ may be used equally
well to reduce $8_N$ to $8_{N-1}$ and thus to prove $8_N$ by induction. The case is
somewhat simpler because we now deal with a finite set of cells from the
beginning. There is a slight complication, however, in so far as the Euclidean
$N$-dimensional space robbed of the point $O$ is not convex, but it is still
connected as long as $N \ge 2$, and that is what counts. Indeed the centers of any
two of our pyramids can be joined by a line consisting of one or two straight
segments without passing through $O$.

As a consequence of Lemma 7, Theorem 14 is sharpened to 


THEOREM 15. A complete system of modulo $\{J\}$ incongruent substitutions
$S$ which carry $Z$ into adjacent cells generates the whole modular group when one
combines them with a system of generators for $\{J\}$.


13. Concluding remarks. Observe that a reduced form $f$ satisfies the
inequality


(77) $f(x_1, \dots, x_n) \ge g_{11}$


not only for integers $x_1, \dots, x_n$ without a (left) common divisor, but for any
integers $(x_1, \dots, x_n) \ne (0, \dots, 0)$ whatsoever. This is nothing else than the
equation M,. 

In this final section we are going to study the cell $Z$ of reduced forms
relatively to the whole $N$-dimensional space $R$ rather than $G$.

The cell $Z$ as a subset of $R$ is not (necessarily) closed; boundary points $f$
which do not belong to $G$ will be semi-definite forms in the sense that
$f(\mathfrak{x}) \ge 0$ for every vector $\mathfrak{x}$, but $f(\mathfrak{x}) = 0$ for certain vectors
$\mathfrak{x} \ne 0$. Such a form can be written as a square sum


$f = z_1^2 + \cdots + z_m^2$


of $m < n$ linear forms $z_1, \dots, z_m$ of the coordinates $x_i$ with real coefficients.
Now if $\epsilon$ is any pre-assigned positive number we can ascertain a lattice vector
$(x_1, \dots, x_n) \ne (0, \dots, 0)$ for which


$|z_1| \le \epsilon, \dots, |z_m| \le \epsilon$


and thus
(78) $f \le m\epsilon^2$.

This is accomplished either by Minkowski’s inequality (1) for a parallelotope
or by an easy application of Dirichlet’s principle concerning the distribution
of $\nu + 1$ objects in $\nu$ boxes. But (78) contradicts (77) unless


$g_{11} = 0$.


Because of the relations (63),


$|g_{12}|, \dots, |g_{1n}| \le r g_{11}$,


which will extend from the forms in $Z$ to those on the boundary of $Z$, the
latter will satisfy the $n$ equations


(79) $g_{11} = g_{12} = \cdots = g_{1n} = 0$.


(Even an appeal to the inequality $g_{1k}^2 \le g_{11} g_{kk}$ valid for all positive forms would
have sufficed here.)

The closure $\bar{Z}$ of $Z$ in $R$ has each of its boundary points either on one of
the planes formerly assembled in the finite set $\Sigma$ or on the plane $g_{11} = 0$. Hence
$\bar{Z}$ as a part of $R$ is completely described by the inequalities $\Sigma$ together with
$g_{11} \ge 0$ and therefore is a pyramid. For $n = 1$, the set $\Sigma$ is empty and we have
the one inequality $g_{11} \ge 0$. We may safely ignore this trivial case. For $n \ge 2$,
$\bar{Z}$ reaches the boundary of $G$ only along the “edge” (79) of $n$ dimensions less;
hence $g_{11} = 0$ is no face of $\bar{Z}$, and the inequality $g_{11} \ge 0$ is redundant. Therefore:


THEOREM 16. The same finite set of inequalities which defines $Z$ in $G$ defines
$\bar{Z}$ in $R$. The boundary points of $\bar{Z}$ which do not belong to $Z$ lie on the edge (79).


The vertices of Z are the so-called extreme forms; every reduced form is a 
linear combination of them with non-negative coefficients, but some of the 
extreme forms will be semi-definite. 

We can now more fully appreciate the fine points in our two theorems of
finiteness. By excluding from $Z$ an arbitrarily small neighborhood


$V_\epsilon$: $g_{11} < \epsilon$


of the “edge” we obtain a compact subset $Z_\epsilon$ of $G$(*). The fact that the
boundary points of $Z$ which lie outside this neighborhood $V_\epsilon$ belong to a finite
number of plane faces is considerably less deep than our first theorem of
finiteness, and so is its proof. When one excludes $V_\epsilon$, one could have used the
region $G(p)$ defined by the first set of inequalities (61) alone instead of $G(p, \sigma)$,
and could have shown that $G(p)$ possesses not more than a finite number of
plane faces outside $V_\epsilon$, while this is not true for $G(p)$ or $G(p, \sigma)$ as a whole.
And the second theorem of finiteness could have been replaced by the less
profound and more easily accessible assertion that there is only a finite number
of $S$ capable of carrying a point of $Z$ outside of $V_\epsilon$ into a point of $G(p)$
outside $V_\epsilon$. These statements would have sufficed for the topological analysis
in §12. Our two theorems of finiteness include the approach to the “edge”
and thus reveal finer features which are of great interest to the algebraist,
though perhaps of less importance from the topological standpoint.


(*) Compact under the convention that proportional forms like $f$ and $tf$ ($t > 0$) are identified.

Up to now positive quadratic forms have been the object of investigation. 
Instead one can study arbitrary affine coordinate systems [16] in an n-dimen- 
sional vector space, consisting of m linearly independent vectors a1, --- , Gn; 
these new objects form an n-dimensional space Y%. Two such systems 
(a1,---, Qn) and (b,---, b,) are said to be (arithmetically) equivalent if 
connected by a unimodular transformation S, 


$b_i = \sum_k s_{ik} a_k$ ($s_{ik}$ integers, $\det (s_{ik}) = \pm 1$).


For any vector r= (x1, - - - , X,) we introduce its square 
2 2 2 


(in accordance with Euclidean metric geometry) and associate the positive 
form 


with the coordinate system (a1, - - - , a,)(*). The latter is said to be reduced 
and to belong to the “cell” 3 of & provided the associated form f is reduced. 
3 is a fundamental domain for the group {S} in %, and we could interpret 
our whole theory in terms of the new objects. The quadratic forms are then 
merely a tool for the study of coordinate systems under the rule of unimodular 
equivalence. We have thus returned to the approach of Chapter I: What we 
now call a reduced system (a, -- - , Gn) was there termed a reduced system 
with respect to the gauge function 


2.1/2 


A similar shift of viewpoint is applicable to the imaginary and the quaternion 
cases. 


BIBLIOGRAPHY 


1. Journal für die reine und angewandte Mathematik, vol. 129 (1905), pp. 220-274; also
Gesammelte Abhandlungen II, Leipzig, 1911, pp. 53-100. Cited as M with the page number in
the Gesammelte Abhandlungen.

2. Sitzungsberichte der Preussischen Akademie der Wissenschaften, 1928, pp. 510-535; 
1929, p. 508. 

3. Quarterly Journal of Mathematics, vol. 9 (1938), pp. 259-262. 

4. H. Weyl, On geometry of numbers, soon to appear in the Proceedings of the London 
Mathematical Society. On the whole subject see H. Hancock, Development of the Minkowski 
Geometry of Numbers, New York, 1939. 


(*) The inequality (54), $g_{11} \cdots g_{nn} \ge D$, for (80) reads in this interpretation as follows: The
volume of a parallelotope cannot exceed the product of the lengths of the vectors by which it
is spanned.




5. Another short proof by H. Davenport, Quarterly Journal of Mathematics, vol. 10 
(1939), pp. 119-121. 

6. Compositio Mathematica, vol. 5 (1938), pp. 368-391. 

7. Cf. Minkowski’s definition in M, p. 59. 

8. See Mahler, loc. cit. (3 above), and the author, loc. cit. (4 above), Theorem v. 

9. Weyl, loc. cit. (4 above), “Generalized Theorem V.” 

10. See M, pp. 56-58. 

11. For more details see L. E. Dickson, Algebren und ihre Zahlentheorie, Zürich, 1927,
chap. 9; C. G. Latimer, American Journal of Mathematics, vol. 48 (1926), pp. 57-66; M. Deuring,
Algebren, Ergebnisse der Mathematik, vol. 4, no. 1, Berlin, 1935, chap. 6.

12. Vorlesungen über die Zahlentheorie der Quaternionen, Berlin, 1919.

13. The larger part of E. H. Moore’s “Algebra of Matrices” (General Analysis, Part I, 
Memoirs of the American Philosophical Society, Philadelphia, 1935) deals with the formalism 
of “Hamiltonian” forms. 

14. Cf. Weyl, loc. cit. (4 above), §8, and the more complicated argument in Bieberbach- 
Schur, loc. cit. (2 above), pp. 521-523. 

15. Loc. cit. (6 above), equation ‘ame 

16. See M, p. 53. 


INSTITUTE FOR ADVANCED STUDY, 
PRINCETON, N. J. 




CONTINUED FRACTIONS AND TOTALLY 
MONOTONE SEQUENCES 


BY
H. S. WALL 


1. Introduction. A sequence $c_0, c_1, c_2, \dots$ of real numbers is called totally
monotone if $\Delta^m c_n \ge 0$ $(m, n = 0, 1, 2, \dots)$, where


$\Delta^m c_n = c_n - C_{m,1} c_{n+1} + C_{m,2} c_{n+2} - \cdots + (-1)^m C_{m,m} c_{n+m}$.


Hausdorff(1) showed that for every totally monotone sequence $c_0, c_1, c_2, \dots$
there exists (essentially uniquely) a monotone nondecreasing real function
$\phi(u)$, $0 \le u \le 1$, such that


$c_n = \int_0^1 u^n \, d\phi(u)$, $n = 0, 1, 2, \dots$.


Conversely, if $\phi(u)$ is a monotone nondecreasing bounded real function on the
interval $0 \le u \le 1$, then $\Delta^m c_n = \int_0^1 (1-u)^m u^n \, d\phi(u) \ge 0$, $m, n = 0, 1, 2, \dots$, so
that $c_0, c_1, c_2, \dots$ is totally monotone.
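
As an illustration of the definition (not part of the original paper), the following Python sketch computes the differences $\Delta^m c_n$ exactly for the sample sequence $c_n = 1/(n+1)$, the moments of the assumed weight $\phi(u) = u$, and compares them with the closed form $m!\,n!/(m+n+1)!$ of the integral $\int_0^1 (1-u)^m u^n\,du$. The function names and the cutoff 6 are hypothetical choices made for the example.

```python
from fractions import Fraction
from math import comb, factorial

def total_difference(c, m, n):
    """Return Delta^m c_n = sum_{k=0}^{m} (-1)^k * C(m, k) * c(n + k)."""
    return sum((-1) ** k * comb(m, k) * c(n + k) for k in range(m + 1))

def moment(n):
    """n-th moment of the assumed weight phi(u) = u on [0, 1]."""
    return Fraction(1, n + 1)

# Table of differences for 0 <= m, n < 6 (an illustrative cutoff).
differences = {(m, n): total_difference(moment, m, n)
               for m in range(6) for n in range(6)}

# Closed form of the integral of (1 - u)^m u^n over [0, 1] (a Beta integral).
beta = {(m, n): Fraction(factorial(m) * factorial(n), factorial(m + n + 1))
        for m in range(6) for n in range(6)}

print(differences[(2, 1)])   # 1/12
print(differences == beta)   # True
```

A finite table of this kind can only refute total monotonicity, never prove it; the integral representation is what settles the matter for all $m$, $n$.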

In case the function $\phi(u)$ has an infinity of points of increase in the
interval $0 \le u \le 1$, then the corresponding sequence is a special Stieltjes moment
sequence, and accordingly there is a Stieltjes continued fraction(2)


(1.1) $b_1/1 + b_2 x/1 + b_3 x/1 + \cdots$,


in which the numbers $b_1, b_2, b_3, \dots$ are real and positive, which corresponds
to the power series


(1.2) $c_0 - c_1 x + c_2 x^2 - \cdots$.


On the other hand, if $\phi(u)$ has but a finite number of points of increase, then
the series (1.2) represents a rational function of $x$ and the continued fraction
terminates.

The main problem which we have solved in the present paper is as follows:
to find necessary and sufficient conditions upon the numbers $b_1, b_2, b_3, \dots$
in the continued fraction (1.1) in order that the coefficients $c_0, c_1, c_2, \dots$ in the
corresponding power series (1.2) shall form a totally monotone sequence. The
result is very simple, namely: the sequence $c_0, c_1, c_2, \dots$ is totally monotone
if and only if there exist real numbers $g_0, g_1, g_2, \dots$ such that $0 \le g_n \le 1$,
$n = 0, 1, 2, \dots$, and such that


$c_0 - c_1 x + c_2 x^2 - \cdots \sim g_0/1 + g_1 x/1 + (1 - g_1)g_2 x/1 + (1 - g_2)g_3 x/1 + \cdots$,


it being agreed that the continued fraction(3) shall terminate with the first
identically vanishing partial quotient.

Presented to the Society February 24, 1940; received January 30, 1940. This paper is
dedicated to Edward Burr Van Vleck on the occasion of his seventy-seventh birthday, June 7,
1940.

(1) F. Hausdorff, Ueber das Momentenproblem für ein endliches Interval, Mathematische
Zeitschrift, vol. 16 (1923), pp. 220-248.

(2) T. J. Stieltjes, Recherches sur les fractions continues, Oeuvres, vol. 2, pp. 402-566. We
have made the substitution of $x$ for $1/z$, and have put $b_1 = 1/a_1$, $b_n = 1/a_n a_{n-1}$, $n \ge 2$, in the series
and continued fraction used by Stieltjes.

If $c_0, c_1, c_2, \dots$ is totally monotone, the function $f(x)$ represented by the
power series (1.2) is analytic for $|x| < 1$. Let $M(f) = \text{l.u.b.}_{|x|<1} |f(x)|$. We show
that $M(f)$ is finite if and only if the series $c_0 + c_1 + c_2 + \cdots$ converges, and
establish the equality


$M(f) = c_0 + c_1 + c_2 + \cdots$.


We also characterize the class $E$ of these “moment generating functions”
which are bounded in the unit circle (a) in terms of the Stieltjes integral
representation of $f(x)$, and (b) in terms of the continued fraction representation
of $f(x)$. It is shown that if $f(x) \in E$, $M(f) \le 1$, then the functions defined
by the algorithm of Schur(4), namely:

$f_{n+1}(x) = \dfrac{1}{x} \cdot \dfrac{t_n - f_n(x)}{1 - t_n f_n(x)}$, $t_n = f_n(0)$,

$n = 0, 1, 2, \dots$, $f_0 = f$, are all in $E$ and have moduli not exceeding 1 for $|x| < 1$.
2. An operation on continued fractions. We shall use the symbol “$\sim$” between
a power series $P(x)$ and a continued fraction $K(x)$ to indicate that the
power series expansion of the $n$th approximant of $K(x)$ agrees term by term
with $P(x)$ for more and more terms as $n$ is increased, or becomes identical
with $P(x)$ from and after some value of $n$. The basic theorem of the paper is


THEOREM 2.1. If $g_1, g_2, g_3, \dots$ are any real or complex numbers, and $P(x)$
is a power series in ascending powers of $x$ such that


(2.1) $P(x) \sim 1 + g_1 x/1 + (1 - g_1)g_2 x/1 + (1 - g_2)g_3 x/1 + \cdots$,

then

(2.2) $(1 + x)/P(x) \sim 1 + (1 - g_1)x/1 + g_1(1 - g_2)x/1 + g_2(1 - g_3)x/1 + \cdots$.


(3) Continued fractions of this form were first treated by E. B. Van Vleck, in a paper
entitled On the convergence and character of the continued fraction $a_1 z/1 + a_2 z/1 + a_3 z/1 + \cdots$, these
Transactions, vol. 2 (1901), pp. 476-483.

(4) J. Schur, Ueber Potenzreihen, die im Innern des Einheitskreises beschränkt sind, Journal
für die reine und angewandte Mathematik, vol. 147 (1916), pp. 205-232, and vol. 148 (1917),
pp. 122-145.




Proof. Let $A_n(x)/B_n(x)$, $A_n^*(x)/B_n^*(x)$ be the $n$th approximants of the
continued fractions in (2.1) and (2.2), respectively. Then we have the relations


(2.3) $A_n(x) = g_n x B_{n-1}^*(x) + B_n^*(x)$, $(1 + x)B_n(x) = g_n x A_{n-1}^*(x) + A_n^*(x)$

($n = 0, 1, 2, \dots$, $g_0 = 1$, $A_{-1}^* = 1$, $B_{-1}^* = 0$). These may be verified directly for
$n = 0, 1$. Assuming that the first is true for $n \le m$, $m \ge 1$, we then have

$A_{m+1}(x) = A_m(x) + g_{m+1}(1 - g_m) x A_{m-1}(x)$
$= [g_m x B_{m-1}^*(x) + B_m^*(x)] + g_{m+1}(1 - g_m) x [g_{m-1} x B_{m-2}^*(x) + B_{m-1}^*(x)]$.

Since $B_{m-1}^*(x) + g_{m-1}(1 - g_m) x B_{m-2}^*(x) = B_m^*(x)$ we then have

$A_{m+1}(x) = g_{m+1} x B_m^*(x) + [B_m^*(x) + g_m(1 - g_{m+1}) x B_{m-1}^*(x)]$
$= g_{m+1} x B_m^*(x) + B_{m+1}^*(x)$,

so that the first relation (2.3) is true for $n = m + 1$, and therefore, by
mathematical induction, for all $n$. The second relation (2.3) may be proved in a
similar way.

On multiplying the first relation (2.3) by $A_n^*(x)$, the second by $B_n^*(x)$,
and then subtracting, we find that the power series expansion in ascending
powers of $x$ of the rational function


$(1 + x)B_n(x)/A_n(x) - A_n^*(x)/B_n^*(x)$


begins with the $(n+1)$th or a higher power of $x$. It follows immediately that
the correspondence (2.1) implies the correspondence (2.2).
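
The relations used in this proof can be checked mechanically. The sketch below is an illustration, not part of the paper: it reads (2.3) as $A_n = g_n x B^*_{n-1} + B^*_n$ and $(1+x)B_n = g_n x A^*_{n-1} + A^*_n$, which is an assumption about the garbled printed formula, and it tests both identities in exact rational arithmetic for arbitrarily chosen sample values of $g_n$ and $x$. The helper name `approximants` is hypothetical.

```python
from fractions import Fraction

def approximants(partial_numerators):
    """Numerators A[n] and denominators B[n] of the approximants of
    1 + a_1/1 + a_2/1 + ... (so A[0] = B[0] = 1), as exact rationals."""
    A_prev, B_prev = Fraction(1), Fraction(0)      # A_{-1}, B_{-1}
    A, B = [Fraction(1)], [Fraction(1)]
    for a in partial_numerators:
        A_next = A[-1] + a * A_prev
        B_next = B[-1] + a * B_prev
        A_prev, B_prev = A[-1], B[-1]
        A.append(A_next)
        B.append(B_next)
    return A, B

# Arbitrary sample data (assumptions); g[0] = 1 follows the convention g_0 = 1.
g = [Fraction(1), Fraction(1, 3), Fraction(2, 5),
     Fraction(1, 2), Fraction(3, 4), Fraction(1, 7)]
x = Fraction(2, 7)
N = len(g) - 1

# Partial numerators of the continued fractions in (2.1) and (2.2).
a = [g[1] * x] + [(1 - g[k - 1]) * g[k] * x for k in range(2, N + 1)]
a_star = [(1 - g[1]) * x] + [g[k - 1] * (1 - g[k]) * x for k in range(2, N + 1)]

A, B = approximants(a)
As, Bs = approximants(a_star)

first_relation = [A[n] == g[n] * x * Bs[n - 1] + Bs[n] for n in range(1, N + 1)]
second_relation = [(1 + x) * B[n] == g[n] * x * As[n - 1] + As[n]
                   for n in range(1, N + 1)]
print(all(first_relation), all(second_relation))
```

Because both sides are polynomials in $x$ and the $g_n$, agreement at one rational point is only a spot check; the induction in the proof is what establishes the identities in general.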
This theorem may be thrown into the following form: 


THEOREM 2.2. Let $c_0 \ne 0$, and

(2.4) $c_0 - c_1 x + c_2 x^2 - \cdots \sim c_0/1 + g_1 x/1 + (1 - g_1)g_2 x/1 + (1 - g_2)g_3 x/1 + \cdots$.

Then

(2.5) $\Delta c_0 - \Delta c_1 x + \Delta c_2 x^2 - \cdots \sim \Delta c_0/1 + g_1(1 - g_2)x/1 + g_2(1 - g_3)x/1 + \cdots$.


Proof. Let $c_0/P(x) = c_0 - c_1 x + c_2 x^2 - \cdots$ in Theorem 2.1. This gives at
once


$c_0 + (c_0 - c_1)x - (c_1 - c_2)x^2 + \cdots \sim c_0 + c_0(1 - g_1)x/1 + g_1(1 - g_2)x/1 + \cdots$.


On removing the constant term $c_0$ from the series and from the continued
fraction, and then dropping a factor $x$, the correspondence (2.5) results. We
note for future reference that


(2.6) $\Delta c_0 = c_0(1 - g_1)$.


By means of these theorems we have enlarged by one the small list of 
known operations on continued fractions. 

The next theorem makes the transformation available for a large class of 
continued fractions. 


THEOREM 2.3. Let $A_n(x)/B_n(x)$ be the $n$th approximant of the continued
fraction


(2.7) $1 + a_1 x/1 + a_2 x/1 + a_3 x/1 + \cdots$,

and let $c \ne 0$ be any number such that $A_n(-c) \ne 0$ $(n = 1, 2, 3, \dots)$. Put

(2.8) $g_n = a_n c A_{n-2}(-c)/A_{n-1}(-c)$,

$n = 1, 2, 3, \dots$, $A_{-1}(-c) = 1$. Then the continued fraction (2.7) takes the form


$1 + g_1(x/c)/1 + (1 - g_1)g_2(x/c)/1 + (1 - g_2)g_3(x/c)/1 + \cdots$.


Proof. One may verify immediately that $a_1 = g_1/c$, $a_n = g_n(1 - g_{n-1})/c$,
$n = 2, 3, 4, \dots$, when the $g_n$’s are given by (2.8).
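
A small numerical sketch of this change of parameters follows; it is an illustration only. It assumes that (2.8) reads $g_n = a_n c\,A_{n-2}(-c)/A_{n-1}(-c)$ with $A_{-1} = A_0 = 1$ (the denominator is not legible in the source), and compares that formula with the recursion $g_1 = a_1 c$, $g_n = a_n c/(1 - g_{n-1})$ obtained by solving the relations in the proof. The sample coefficients `a` and both function names are hypothetical.

```python
from fractions import Fraction

def g_by_recursion(a, c):
    """g_1 = a_1 c, g_n = a_n c / (1 - g_{n-1}); a[0] holds a_1."""
    g, previous = [], Fraction(0)
    for a_n in a:
        previous = a_n * c / (1 - previous)
        g.append(previous)
    return g

def g_by_numerators(a, c):
    """g_n = a_n c A_{n-2}(-c) / A_{n-1}(-c), using A_{-1} = A_0 = 1 and
    the recurrence A_n(-c) = A_{n-1}(-c) - a_n c A_{n-2}(-c)."""
    A_prev2, A_prev1 = Fraction(1), Fraction(1)    # A_{n-2}(-c), A_{n-1}(-c)
    g = []
    for a_n in a:
        g.append(a_n * c * A_prev2 / A_prev1)
        A_prev2, A_prev1 = A_prev1, A_prev1 - a_n * c * A_prev2
    return g

# Sample coefficients a_1, ..., a_4 and c = 1 (assumptions for the example).
a = [Fraction(1, 2), Fraction(1, 6), Fraction(1, 3), Fraction(1, 5)]
c = Fraction(1)

g = g_by_recursion(a, c)
print(g == g_by_numerators(a, c))
print([str(value) for value in g])

# Recover the a_n from the g_n: a_1 = g_1/c, a_n = g_n (1 - g_{n-1}) / c.
recovered = [g[0] / c] + [g[n] * (1 - g[n - 1]) / c for n in range(1, len(g))]
```

Both routes break down exactly when some $A_n(-c)$ vanishes, which is the case the hypothesis of the theorem excludes.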

We remark in passing that if $a_n \ne 0$ in (2.7), and $P(x)$ is the power series
corresponding to (2.7), then one may apply Theorem 2.1 to obtain the formulas
given by Stieltjes(5) for the continued fraction corresponding to
$1/P(x)$. In fact:


(2.9) $(1 + (x/c))/P(x) \sim 1 + (1 - g_1)(x/c)/1 + g_1(1 - g_2)(x/c)/1 + g_2(1 - g_3)(x/c)/1 + \cdots$.


We may allow $c$ to become infinite and the correspondence will be maintained
provided the coefficients of $x$ in the continued fraction have limits which are
finite and not 0. Accordingly we obtain this theorem:


THEOREM 2.4. If the power series $P(x)$ has a corresponding continued fraction
(2.7) in which $a_n \ne 0$, $n \ge 1$, then $1/P(x)$ will have a corresponding continued
fraction of the same form provided $A_{2n}(x)$, the numerator of the $2n$th approximant
of (2.7), is of degree $n$ for $n = 0, 1, 2, \dots$.


The condition of the theorem is met when the $a_n$’s are real and positive,
which is the case with which Stieltjes was concerned. If one evaluates the
limits for $c = \infty$ of the coefficients of $x$ in the continued fraction of (2.9), he
will find that the result agrees with that found by Stieltjes.


(5) O. Perron, Die Lehre von den Kettenbrüchen, 1st edition, pp. 334-335.




3. A uniform convergence theorem. In a recent(6) paper Scott and Wall
proved a theorem which may be stated in the following form:


THEOREM 3.1. If $g_1, g_2, g_3, \dots$ are real numbers such that $0 < g_1 < 1$,
$0 \le g_n < 1$, $n > 1$, and $x_1, x_2, x_3, \dots$ are functions of any variables, then the
continued fraction


(1 — gi)gex (1 — ge)gsxe (1 — gs)gaxs 
1 + 1 + 1 + 1 + 


converges uniformly for $|x_n| \le 1$, $n = 1, 2, 3, \dots$. The denominators of all the
approximants are nonzero in this domain. Let $G$ denote the value of the continued
fraction. Then


(3.1) 


1 


gn 
1 
+d (1 — gi)(1 — ge) -- (1 — gn) 


if $|x_n| \le 1$, $n \ge 1$; and $G$ is equal to the expression on the right if $x_n = -1$, $n \ge 1$.


1 
81 


We shall digress momentarily at this point to discuss two theorems given
by Perron on page 262 of his book. The first of these may, with no essential
loss in generality, be stated as follows:


THEOREM 3.2. If the elements of the continued fraction
(3.2) $1/1 + a_2/1 + a_3/1 + a_4/1 + \cdots$


are functions of any variables, then the continued fraction converges uniformly
over the domain characterized by the inequalities


$|a_n| \le (p_n - 1)/p_n p_{n-1}$, $n = 2, 3, 4, \dots$,


where $p_1, p_2, p_3, \dots$ are any real constants greater than 1 for which the series


is divergent.


Perron then says: “Ein bemerkenswerter Spezialfall unseres allgemeinen 
Kriteriums ist” and then proves a theorem, attributed to Van Vleck, which 
may be stated as follows: 


THEOREM 3.3. If $g_1, g_2, g_3, \dots$ are real numbers such that $0 < g_n < 1$,
and $x_1, x_2, x_3, \dots$ are functions of any variables, then the continued fraction
(3.1) converges uniformly for $|x_n| \le 1$.


(6) W. T. Scott and H. S. Wall, A convergence theorem for continued fractions, these
Transactions, vol. 47 (1940), pp. 155-172.




It is a strange fact that the second theorem is more general than the first.
To see this, put $a_n = g_n(1 - g_{n-1})x_n$, $n \ge 2$; $p_n = 1/(1 - g_n)$, $n \ge 1$, and the second
theorem reduces to the first minus the requirement on the series (3.3).
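
The substitution can be spot-checked in a few lines. The sketch below is an illustration with arbitrarily chosen values of $g_n$ (an assumption): it confirms that with $p_n = 1/(1 - g_n)$ the bound $(p_n - 1)/p_n p_{n-1}$ of Theorem 3.2 equals $g_n(1 - g_{n-1})$, and that each $p_n$ exceeds 1 when $0 < g_n < 1$.

```python
from fractions import Fraction

# Sample values g_1, ..., g_4 in the open interval (0, 1); chosen arbitrarily.
g = [Fraction(1, 3), Fraction(1, 2), Fraction(2, 7), Fraction(3, 5)]

# The substitution p_n = 1 / (1 - g_n).
p = [1 / (1 - g_n) for g_n in g]

# Bound of Theorem 3.2 and bound implied by Theorem 3.3, for n = 2, 3, 4.
bound_perron = [(p[n] - 1) / (p[n] * p[n - 1]) for n in range(1, len(g))]
bound_van_vleck = [g[n] * (1 - g[n - 1]) for n in range(1, len(g))]

print(bound_perron == bound_van_vleck)
print(all(p_n > 1 for p_n in p))
```

The algebra behind the check is $(p_n - 1)/p_n = g_n$ and $1/p_{n-1} = 1 - g_{n-1}$.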

It should be added that Theorem 3.3 is related to but quite different from
the theorem which Van Vleck proved(7). He gave preference to the continued
fraction (1 ---, the reciprocal of 
which is, except for an unimportant term and factor, the continued fraction 
(3.1). Theorem 3.1 is an improvement over Theorem 3.3, in that the $g_n$’s
after the first are permitted to be 0. For this reason Theorem 3.1 contains
the theorem given by Perron on page 258 (Theorem 26), in which it may be
assumed with no loss in generality that $p_n > 1$.

4. Totally monotone sequences corresponding to an “infinite distribution
of mass.” The sequence $c_n = \int_0^1 u^n \, d\phi(u)$, $n = 0, 1, 2, \dots$, in which $\phi(u)$ is real
and monotone nondecreasing is completely characterized by the inequalities


(4.1) $\Delta^m c_n \ge 0$, $m, n = 0, 1, 2, \dots$,


and is said to be a totally monotone sequence. If $\phi(u)$ has an infinite number of
points of increase, we shall say that there is an infinite distribution of mass.


THEOREM 4.1. The sequence $c_0, c_1, c_2, \dots$ is a totally monotone sequence
corresponding to an infinite distribution of mass if and only if there exist real
numbers $g_1, g_2, g_3, \dots$ such that $0 < g_n < 1$, $n \ge 1$, and such that


(4.2) $c_0 - c_1 x + c_2 x^2 - \cdots \sim c_0/1 + g_1 x/1 + (1 - g_1)g_2 x/1 + (1 - g_2)g_3 x/1 + \cdots$, $(c_0 > 0)$.


Proof. Suppose that (4.2) holds. Then by Theorem 3.1 the continued fraction
$1 + g_1 x/1 + (1 - g_1)g_2 x/1 + (1 - g_2)g_3 x/1 + \cdots$ converges uniformly for
$|x| < 1$. If $f(x)$ is the analytic function represented, then by Theorem 2.1
and Theorem 3.1, $(1 + x)/f(x)$ is analytic for $|x| < 1$, and therefore $c_0/f(x)$,
the function represented by the continued fraction and series (4.2), is analytic
for $|x| < 1$.

Now the coefficients of $x$ in the continued fraction (4.2) are positive, and
hence by the work of Stieltjes, this continued fraction represents a function
of the form $\int_0^\infty d\phi(u)/(1 + xu)$, where $\phi(u)$ is monotone nondecreasing, and
has an infinite number of points of increase. Inasmuch as this function is
analytic for $|x| < 1$, and the corresponding continued fraction converges
uniformly for $|x| < r$ where $r$ is any positive number less than 1, it follows
that the upper limit of integration may be taken equal to 1. Then
$\Delta^m c_n = \int_0^1 (1 - u)^m u^n \, d\phi(u)$, and therefore (4.1) holds. Thus the sequence is
totally monotone, and corresponds to an infinite distribution of mass.

Conversely, let $c_n = \int_0^1 u^n \, d\phi(u)$, where $\phi(u)$ is monotone nondecreasing and
has an infinite number of points of increase. That is, $c_0, c_1, c_2, \dots$ is a totally
monotone sequence corresponding to an infinite distribution of mass. Then,
by the work of Stieltjes, we must have a correspondence of the form


$c_0 - c_1 x + c_2 x^2 - \cdots \sim a_1/1 + a_2 x/1 + a_3 x/1 + \cdots$,


where $a_1, a_2, a_3, \dots$ are real and positive. Moreover, the function represented
by this series and continued fraction is $\int_0^1 d\phi(u)/(1 + xu)$. Since the limits of
integration are from 0 to 1, the zeros(8) of $B_n(x)$, the denominator of the $n$th
approximant of the continued fraction, are real and less than $-1$. Since
$B_n(0) = 1$ it therefore follows that $B_n(-1) > 0$. We may then apply Theorem
2.3, with $c = 1$, to the continued fraction


$1 + a_2 x/1 + a_3 x/1 + a_4 x/1 + \cdots$,


the numerator of whose $n$th approximant is $B_{n+1}(x)$, and thus determine numbers
$g_1, g_2, g_3, \dots$, such that


$a_2 = g_1$, $a_n = g_{n-1}(1 - g_{n-2})$, $n > 2$.

Now, by Theorem 2.2,

$\Delta c_0 - \Delta c_1 x + \Delta c_2 x^2 - \cdots \sim \Delta c_0/1 + g_1(1 - g_2)x/1 + g_2(1 - g_3)x/1 + \cdots$.


Also, $\Delta c_n = \int_0^1 u^n \, d\phi_1(u)$, where $d\phi_1(u) = (1 - u)\, d\phi(u)$, and $\phi_1(u)$ is monotone
nondecreasing and has an infinite number of points of increase. It follows that the
coefficients of $x$ in the last continued fraction must all be positive, and that
$\Delta c_0 = c_0(1 - g_1) > 0$ (cf. (2.6)). We therefore have

$g_1 > 0$, $g_{n-1}(1 - g_{n-2}) > 0$, $n > 2$,
$1 - g_1 > 0$, $g_{n-2}(1 - g_{n-1}) > 0$, $n > 2$,

and consequently $0 < g_n < 1$, $n = 1, 2, 3, \dots$, as was to be proved.


(7) See footnote 3.
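5. Developments from the continued fraction algorithm. We shall begin
by considering an example. It is known(9) that the series

$$F(\alpha,1,\gamma;-x)=1-\frac{\alpha}{\gamma}x+\frac{\alpha(\alpha+1)}{\gamma(\gamma+1)}x^2-\cdots$$

has the corresponding continued fraction $1/1+c_1x/1+c_2x/1+c_3x/1+\cdots$,
where

$$c_1=\frac{\alpha}{\gamma},\qquad c_{2n}=\frac{n(\gamma-\alpha+n-1)}{(\gamma+2n-2)(\gamma+2n-1)},\qquad c_{2n+1}=\frac{(\alpha+n)(\gamma+n-1)}{(\gamma+2n-1)(\gamma+2n)}.$$

Naturally $\alpha$, $\gamma$ are not negative integers or 0. We then readily find that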


(8) Perron, loc. cit., p. 368 and p. 383. 
(9) Perron, loc. cit., p. 348.




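$$F(\alpha,1,\gamma;-x)\sim\frac{1}{1}+\frac{g_1x}{1}+\frac{(1-g_1)g_2x}{1}+\frac{(1-g_2)g_3x}{1}+\cdots,$$

where $g_{2n}=n/(\gamma+2n-1)$, $g_{2n-1}=(\alpha+n-1)/(\gamma+2n-2)$, $n=1, 2, 3, \cdots$. On
applying Theorem 2.1 we readily obtain the power series identity

(5.1) $$F(\alpha,1,\gamma;-x)=\frac{1}{1+x}+\frac{\gamma-\alpha}{\gamma}\cdot\frac{x}{1+x}\,F(\alpha,1,\gamma+1;-x).$$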


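The repeated application of this identity gives the Euler expansion

$$F(\alpha,1,\gamma;-x)=\frac{1}{1+x}+\frac{\gamma-\alpha}{\gamma}\cdot\frac{x}{(1+x)^2}+\frac{(\gamma-\alpha)(\gamma-\alpha+1)}{\gamma(\gamma+1)}\cdot\frac{x^2}{(1+x)^3}+\cdots.$$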
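The identity (5.1) and the continued fraction with the parameters $g_n$ can be checked on truncated power series. The following is a minimal sketch in exact rational arithmetic; the helper names (`hyp_series`, `sfraction_series`, `hyp_g`, `identity_5_1_holds`) are illustrative and not part of the paper, and the check covers only finitely many coefficients.

```python
from fractions import Fraction

def hyp_series(alpha, gamma, n):
    """First n power series coefficients of F(alpha, 1, gamma; -x)."""
    coeffs, term = [], Fraction(1)
    for k in range(n):
        coeffs.append(term)
        term = -term * (alpha + k) / (gamma + k)
    return coeffs

def series_inverse(p, n):
    """First n coefficients of 1/p(x), assuming p[0] != 0."""
    q = [Fraction(1) / p[0]]
    for k in range(1, n):
        s = sum(p[j] * q[k - j] for j in range(1, k + 1))
        q.append(-s / p[0])
    return q

def sfraction_series(a, n):
    """First n coefficients of a[0]/(1 + a[1]x/(1 + a[2]x/(1 + ...)))."""
    tail = [Fraction(1)] + [Fraction(0)] * (n - 1)
    for coeff in reversed(a[1:]):
        inv = series_inverse(tail, n)
        # tail <- 1 + coeff * x / tail
        tail = [Fraction(1)] + [coeff * inv[k - 1] for k in range(1, n)]
    return [a[0] * c for c in series_inverse(tail, n)]

def hyp_g(alpha, gamma, count):
    """The parameters g_1, ..., g_count quoted in the text."""
    g = []
    for k in range(1, count + 1):
        n = (k + 1) // 2
        if k % 2:  # k = 2n - 1
            g.append((alpha + n - 1) / (gamma + 2 * n - 2))
        else:      # k = 2n
            g.append(Fraction(n) / (gamma + 2 * n - 1))
    return g

def identity_5_1_holds(alpha, gamma, n):
    """Compare (1+x)F(gamma) with 1 + ((gamma-alpha)/gamma) x F(gamma+1)."""
    f0 = hyp_series(alpha, gamma, n)
    f1 = hyp_series(alpha, gamma + 1, n)
    lhs = [f0[0]] + [f0[k] + f0[k - 1] for k in range(1, n)]
    rhs = [Fraction(1)] + [(gamma - alpha) / gamma * f1[k - 1] for k in range(1, n)]
    return lhs == rhs
```

With $\alpha=1/2$, $\gamma=3/2$ the series is $1-x/3+x^2/5-\cdots$, which makes a convenient test case.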


We now propose to obtain the analogous developments for the general
continued fraction of this form. For simplicity, let $g_n$ be real and $0<g_n<1$,
and put

$$f(x)=1+g_1x/1+(1-g_1)g_2x/1+(1-g_2)g_3x/1+\cdots,$$

so that

$$(1+x)/f(x)=1+(1-g_1)x/1+g_1(1-g_2)x/1+g_2(1-g_3)x/1+\cdots.$$

Let $f_1(x)=1+g_1(1-g_2)x/1+g_2(1-g_3)x/1+\cdots$, and denote by $A_n(x)$ the numerator
of the $n$th approximant of this continued fraction. Then it is easy
to prove by mathematical induction that
Ana(— 1) = 

where 

n 1— (1 — 

> ( 81)( 82) ( 49) 

We may therefore apply Theorem 2.3 with c=1 and obtain 


(1) 


where =(1—gn41) n= 1, 2, 3, 
($S_0=1$). Consequently $0<g_n^{(1)}<1$, so that the continued fraction for $f_1(x)$ has precisely
the same form as that for $f(x)$. Hence we may write

1+ x (1 — gi )x gs (1—g3 

f(x) 1 + 1 + 1 + 




put fo(x) =1+g}" ---, and apply the above 
argument to the latter continued fraction. In this manner we arrive at the 
following theorem: 


THEOREM 5.1. Let $f(x)=1+g_1x/1+(1-g_1)g_2x/1+(1-g_2)g_3x/1+\cdots$,
where $0<g_n<1$. Define numbers $g_k^{(m)}$ by the relations


(0) (m (m—1), 
m,k=1,2,3,---: 
(k—1) (k—1) (k—1) 


) 


> 
j=1 


so that <1; and put (1 
Then the functions fo=f, fi, fo, fs, satisfy the following identities: 


=0,1,2,--- , and consequently 1/f(x) has the Euler expansion 


1 1 x? 


= —— + (1 — + (1 ) 


fle) (1+ 2)? 
CoROLLaRY 5.1. Let and co/f(x) ---. Then 


(m—1) 


(5.4) = co(1 — gi) (1— gi”). 
COROLLARY 5.2. The correspondence 


m (m) (m) 
Ac f 2 (1 — gi 


(5.5 


+ 


x 


— + A™ox? — 


ts valid form=0,1,2,--- 


From (5.4) it follows that if $c_0>0$ then $\Delta^mc_0>0$ for $m=1, 2, 3, \cdots$. We
shall show next that $\Delta^mc_n>0$ for all $m$, $n$. To do this it will suffice to prove


THEOREM 5.2. If $0<g_n<1$ and

$$c_0-c_1x+c_2x^2-\cdots\sim c_0/1+g_1x/1+(1-g_1)g_2x/1+(1-g_2)g_3x/1+\cdots,$$

then

$$c_1-c_2x+c_3x^2-\cdots\sim c_1/1+h_1x/1+(1-h_1)h_2x/1+\cdots,$$

where $0<h_n<1$.


Proof. With the aid of Stieltjes integrals and the results of §4, one may
prove this theorem in a few lines. The proof may also be made by the
continued fraction algorithm alone, using the four lemmas which follow.


LEMMA 5.1. The continued fraction('°) 


Un—10n 


uy + — 


in which u;~0,1=1, 2, 3,---,m, is equal to 


uy 


1+ 


v Ved Un 
This may be readily proved by mathematical induction. 
LEMMA 5.2. If $a_n>0$ and
$$c_0-c_1x+c_2x^2-\cdots\sim a_1/1+a_2x/1+a_3x/1+\cdots$$
then(11)
$$c_1-c_2x+c_3x^2-\cdots\sim b_1/1+b_2x/1+b_3x/1+\cdots$$
where
(5.6) $$b_1=a_1a_2,\qquad b_2=a_2+a_3,\qquad b_{2n+2}=a_{2n+2}+a_{2n+3}-b_{2n+1},$$


(5.7) Benge = Gongs + 


Gen+1 Gen+142n-1 
1 + + 


den Ae 


Proof. It is well known that $c_1-c_2x+c_3x^2-\cdots$ has a corresponding
continued fraction of the form specified. Moreover, the "odd part" of
$a_1/1+a_2x/1+a_3x/1+\cdots$ must be the same as the "even part" of
$b_1/1+b_2x/1+b_3x/1+\cdots$, multiplied by $x$ and subtracted from $c_0$. That is,

(5.8) $$a_1-\cfrac{a_1a_2x}{1+(a_2+a_3)x-\cfrac{a_3a_4x^2}{1+(a_4+a_5)x-\cdots}}=c_0-\cfrac{b_1x}{1+b_2x-\cfrac{b_2b_3x^2}{1+(b_3+b_4)x-\cfrac{b_4b_5x^2}{1+(b_5+b_6)x-\cdots}}}.$$

(10) O. Szász, Ueber die Erhaltung der Konvergenz unendlicher Kettenbrüche ..., Journal
für die reine und angewandte Mathematik, vol. 147 (1916), pp. 132-160.
(11) This holds for any $a_n$'s such that there is no division by zero in the formulas.


On equating corresponding elements in these continued fractions we obtain
(5.6) and the relation $b_{2n}b_{2n+1}=a_{2n+1}a_{2n+2}$. Hence, on combining this with (5.6),

$$b_{2n+2}=a_{2n+3}+a_{2n+2}-\cfrac{a_{2n+1}a_{2n+2}}{a_{2n+1}+a_{2n}-\cfrac{a_{2n-1}a_{2n}}{a_{2n-1}+a_{2n-2}-\cdots-\cfrac{a_3a_4}{a_3+a_2}}},$$

which, by Lemma 5.1, reduces to (5.7).
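The recurrences of Lemma 5.2 can be tried out numerically. The sketch below, in exact rational arithmetic, computes $b_1, b_2, \cdots$ from $a_1, a_2, \cdots$ using (5.6) together with the relation $b_{2n}b_{2n+1}=a_{2n+1}a_{2n+2}$, and lets one compare the two power series. The function names are illustrative, and the sample $a_n=1/n$ is an arbitrary assumption chosen only so that all $a_n$ are positive.

```python
from fractions import Fraction

def series_inverse(p, n):
    """First n coefficients of 1/p(x), assuming p[0] != 0."""
    q = [Fraction(1) / p[0]]
    for k in range(1, n):
        s = sum(p[j] * q[k - j] for j in range(1, k + 1))
        q.append(-s / p[0])
    return q

def sfraction_series(a, n):
    """First n coefficients of a[0]/(1 + a[1]x/(1 + a[2]x/(1 + ...)))."""
    tail = [Fraction(1)] + [Fraction(0)] * (n - 1)
    for coeff in reversed(a[1:]):
        inv = series_inverse(tail, n)
        tail = [Fraction(1)] + [coeff * inv[k - 1] for k in range(1, n)]
    return [a[0] * c for c in series_inverse(tail, n)]

def shifted_coefficients(a):
    """Given a = [a_1, ..., a_N] (N >= 3), return [b_1, ..., b_{N-1}]."""
    b = [a[0] * a[1], a[1] + a[2]]              # b_1 = a_1 a_2, b_2 = a_2 + a_3
    for idx in range(2, len(a) - 1):            # idx is the 0-based index of b
        if idx % 2 == 0:                        # b_{2n+1} = a_{2n+1} a_{2n+2} / b_{2n}
            b.append(a[idx] * a[idx + 1] / b[idx - 1])
        else:                                   # b_{2n+2} = a_{2n+2} + a_{2n+3} - b_{2n+1}
            b.append(a[idx] + a[idx + 1] - b[idx - 1])
    return b

a = [Fraction(1, k) for k in range(1, 8)]       # a_1, ..., a_7
b = shifted_coefficients(a)                     # b_1, ..., b_6
p = sfraction_series(a, 7)                      # c_0, -c_1, c_2, ...
q = sfraction_series(b, 6)                      # c_1, -c_2, c_3, ...
```

Since the coefficient of $x^k$ in `p` is $(-1)^kc_k$ and in `q` is $(-1)^kc_{k+1}$, agreement means `q[k] == -p[k + 1]` for each available `k`.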


LEMMA 5.3. If in Lemma 5.2 the $a_n$'s have the form $a_2=g_1$, $a_n=g_{n-1}(1-g_{n-2})$,
$n>2$, where $0<g_n<1$, then $0<b_n<1$, $n\ge 2$.


Proof. We have 
-(1—gen42) — <1. Hence also, 0<den41<1. 


LEMMA 5.4. Under the hypothesis of Lemma 5.3 the numerators of the approximants
of the continued fraction $1+b_2x/1+b_3x/1+b_4x/1+\cdots$ are all
positive for $x=-1$.


Proof. Let $A_n(x)$, $n=0, 1, 2, \cdots$, be the numerators in question, and
let $C_n(x)$ be the denominator of the $n$th approximant of

(5.9) $$a_1/1+a_2x/1+a_3x/1+\cdots.$$


Then by (5.8), Cons1(x) =Aon(x). Now | Co(x)| =|1+g:x| = (1—g1)| Ci(x)], if 
|x| <1. If, for any value of , |x| <1 implies that 


| Ca(x) | S (1 — | | 2 (1 — gi)(1 — ge) (1 — gna), 


then | Cuyi(x)| =| Ca(x) 2| Ca(x)| —gn(1 Coa(x) | 
if | x| Hence(??) 


| | = (1 — gn) | Ca(x)| = (1 — — ge) (1 — 


Inasmuch as C,(0)=1, we must therefore have Cons:1(—1) =A2n(—1)>0. 
Now Aony2(x) Hence if Aenyi(x) =0 for —1Sx<0, 
then Aeni2(x) and Ae,(x) would have opposite signs for this value of x. Since 
this is impossible, it follows that Aen4:1(—1)>0. 

The proof of Theorem 5.2 may now be readily made. By Lemma 5.4 we 
may apply Theorem 2.3 with c=1 and obtain =hy, n>2, 
where But by Lemma 5.3, 
0 <1, m>2. Consequently 0<h, <1, n21. 


THEOREM 5.3. If $c_0>0$, and
$$c_0-c_1x+c_2x^2-\cdots\sim c_0/1+g_1x/1+(1-g_1)g_2x/1+(1-g_2)g_3x/1+\cdots,$$
where $0<g_n<1$, then


(12) This induction was used by Van Vleck, loc. cit.


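(5.10) $$\Delta^mc_n>0,\qquad m,n=0,1,2,\cdots$$
and
(5.11) $$\frac{\Delta c_n}{c_n}<\frac{\Delta^2c_n}{\Delta c_n}<\frac{\Delta^3c_n}{\Delta^2c_n}<\cdots,\qquad n=0,1,2,\cdots.$$
Proof. The inequalities (5.10) follow from (5.4) and Theorem 5.2. It will
suffice to prove (5.11) for the case $n=0$. By (5.4) we have

$$\frac{\Delta^{m+1}c_0}{\Delta^mc_0}=1-g_1^{(m)}=1-g_1^{(m-1)}(1-g_2^{(m-1)})>1-g_1^{(m-1)}=\frac{\Delta^mc_0}{\Delta^{m-1}c_0},$$

$m=1, 2, 3, \cdots$, which was to be proved.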

6. Totally monotone sequences corresponding to a finite distribution of
mass. Consider a terminating continued fraction of the form
$c_0/1+g_1x/1+(1-g_1)g_2x/1+\cdots+(1-g_{k-1})g_kx/1$, where $c_0>0$, $0<g_n<1$, $n=1, 2, 3, \cdots, k-1$,
$0<g_k\le 1$. We know that if $k=2m-1$ this continued fraction represents
a rational function of the form

$$\frac{A_{2m}(x)}{B_{2m}(x)}=\frac{M_1}{1+x_1x}+\frac{M_2}{1+x_2x}+\cdots+\frac{M_m}{1+x_mx},$$

where $M_i>0$, $i=1, 2, 3, \cdots, m$, and $0<x_1<x_2<\cdots<x_m\le 1$. On the other
hand, if $k=2m$, the function represented has the form

$$\frac{A_{2m+1}(x)}{B_{2m+1}(x)}=M_0+\frac{M_1}{1+x_1x}+\frac{M_2}{1+x_2x}+\cdots+\frac{M_m}{1+x_mx},$$

where $M_i>0$, $i=0, 1, 2, \cdots, m$, and $0<x_1<x_2<\cdots<x_m\le 1$. Naturally
the $M_i$'s, $x_i$'s are not the same in the two cases. In either case if
$c_0-c_1x+c_2x^2-\cdots$ is the corresponding power series, then

$$c_n=\sum_{i=0}^{m}M_ix_i^n,\qquad n=0,1,2,\cdots,$$

where $x_0=0$, $x_0^0=1$, and $M_0$ is positive or 0 according as $k=2m$ or $2m-1$.
Thus we may write $c_n=\int_0^1u^n\,d\phi(u)$, $n=0, 1, 2, \cdots$, where $\phi(u)$ is a monotone
nondecreasing function with but a finite number of points of increase.

Conversely, let $c_0, c_1, c_2, \cdots$ be a totally monotone sequence corresponding
to a finite distribution of mass. Then $c_n$ has a representation of the form

(6.1) $$c_n=\int_0^1u^n\,d\phi(u),\qquad n=0,1,2,\cdots,$$

where $\phi(u)$ is a step-function. There are two cases to be considered according
as $\phi(u)$ is or is not continuous at $u=0$.


LEMMA 6.1. Let $A_n$, $B_n$ denote the determinants $|c_{i+j}|$, $|c_{i+j+1}|$ $(i, j=0, 1, \cdots, n)$,
respectively. Then if $\phi(u)$ is continuous at $u=0$, these determinants
are positive for $n<m$, and if $\phi(u)$ is discontinuous at $u=0$, $A_n>0$ for $n\le m$,
and $B_n>0$ for $n<m$, where $m$ is the number of values of $u>0$ where $\phi(u)$ is discontinuous.

Proof. Put

$$c_n=\sum_{i=0}^{m}M_ix_i^n,$$

where $0<x_1<x_2<\cdots<x_m\le 1$, $M_1, M_2, \cdots, M_m$ are positive, and $x_0=0$,
$x_0^0=1$. If $\phi(u)$ is continuous at $u=0$, then $M_0=0$, while if $\phi(u)$ is discontinuous
at $u=0$, $M_0>0$. Consider the quadratic form

$$\sum_{i,j=0}^{n}c_{i+j+r}X_iX_j=\int_0^1u^r\Bigl(\sum_{i=0}^{n}X_iu^i\Bigr)^2d\phi(u).$$

If $r=1$ this is clearly positive definite for $n<m$, so that $B_n>0$ for $n<m$ in
both cases. When $r=0$, it is positive definite for $n<m$ if $\phi(u)$ is continuous
at $u=0$, and for $n\le m$ if $\phi(u)$ is discontinuous at $u=0$. Hence $A_n>0$ in the
first case for $n<m$, and in the second for $n\le m$.

Now

$$P(x)=c_0-c_1x+c_2x^2-\cdots=M_0+\frac{M_1}{1+x_1x}+\cdots+\frac{M_m}{1+x_mx}=\frac{S(x)}{T(x)},$$

where $T(x)$ is a polynomial of degree $m$, and $S(x)$ is of degree $m$ or $m-1$ according
as $M_0>0$ or $M_0=0$. Put


a = Co, den = = 
n=1, 2, 3,---,A-1=B_1=1, and form the continued fractions 


Aom(x) a 
(6.2) 
I+ 1 


and 


(6.3) 


according as $M_0=0$ or $M_0>0$. The vanishing partial quotient is affixed for a




reason to appear presently. On taking account of the degrees of numerators 
and denominators, and of the degree of approximation of these rational frac- 
tions to the power series P(x), we conclude that they are identical with 
S(x)/T (x) in the two cases. 

The denominators of the $n$th approximants of (6.2), (6.3) are greater than
0 if $-1<x<0$, and $n<2m$, $n<2m+1$, respectively. Hence we may apply
Theorem 2.3 with $c=1$ to show that the $a_n$'s have the form $a_n=g_{n-1}(1-g_{n-2})$,
$n>2$, $a_2=g_1$. We then apply Theorem 2.2 and obtain in the case of (6.2):

$$\Delta c_0-\Delta c_1x+\Delta c_2x^2-\cdots\sim\frac{\Delta c_0}{1}+\frac{g_1(1-g_2)x}{1}+\cdots+\frac{g_{2m-2}(1-g_{2m-1})x}{1}+\frac{g_{2m-1}x}{1}.$$

It is easy to see that $\phi(u)$ is discontinuous at $u=1$ if and only if $g_{2m-1}=1$. In
case $g_{2m-1}=1$, the above continued fraction terminates with the $(2m-2)$th
partial quotient, while if $g_{2m-1}<1$ it terminates with the $2m$th partial quotient.

Now $\Delta c_n=\int_0^1u^n\,d\phi_1(u)$, where $\phi_1(u)=\int_0^u(1-t)\,d\phi(t)$. Hence $\phi_1(u)$ has the
same number of discontinuities as, or a smaller number by one than, $\phi(u)$,
according as 1 is not or is, respectively, a point of discontinuity of $\phi(u)$. Since
$\phi_1(u)$ is a function of the same character as $\phi(u)$, we conclude that $1-g_1>0$,
$g_{n-1}(1-g_n)>0$, $n=2, 3, \cdots, 2m-2$ if $g_{2m-1}=1$, and $n=2, 3, \cdots, 2m-1$ if
$g_{2m-1}<1$. But we previously had $g_1>0$, $g_n(1-g_{n-1})>0$, $n=2, 3, \cdots, 2m-1$.
Hence we conclude that $0<g_n<1$, $n=1, 2, 3, \cdots, 2m-2$, $0<g_{2m-1}\le 1$. The
treatment of (6.3) is exactly the same. We have therefore completed the proof
of the following theorem:


THEOREM 6.1. If $c_0, c_1, c_2, \cdots$ is a totally monotone sequence corresponding
to a finite distribution of mass, then $c_0-c_1x+c_2x^2-\cdots$ is the constant $c_0\ge 0$,
or else

$$c_0-c_1x+c_2x^2-\cdots\sim c_0/1+g_1x/1+(1-g_1)g_2x/1+\cdots+(1-g_{k-1})g_kx/1,$$

where $0<g_n<1$, $n<k$, $0<g_k\le 1$, $c_0>0$. Conversely, any sequence determined in
this way is a totally monotone sequence corresponding to a finite distribution of
mass.


7. The moment problem for the interval $(-\infty, 1)$. The methods used previously
may be employed to prove the following theorem:

THEOREM 7.1. If $\phi(u)$ is a monotone nondecreasing function in the interval
$(-\infty, 1)$, such that the moments $c_n=\int_{-\infty}^1u^n\,d\phi(u)$, $n=0, 1, 2, \cdots$, are all finite,
and if the series $c_0-c_1x+c_2x^2-\cdots$ has a corresponding continued fraction,
then this continued fraction must have the form
(7.1) $$c_0/1+g_1x/1+(1-g_1)g_2x/1+(1-g_2)g_3x/1+\cdots,$$
where $c_0>0$, and


> 0, (1 S2n—1) (1 Z2n—2) > 0, 
n=1,2,3,--+, go=0. 


Proof. Let $b_1/1+b_2x/1+b_3x/1+\cdots$ be the corresponding continued fraction
which is supposed to exist. Then it is known(13) that $b_1>0$, $b_{2n}b_{2n+1}>0$,
$n=1, 2, 3, \cdots$. The zeros of the denominators $B_n(x)$ of the approximants of
this continued fraction are all real and none of them lie in the interval(14)
$-1<x<0$. Hence $B_n(-1)>0$. By Theorem 2.3 we may then write this continued
fraction in the form (7.1) where $g_n\ne 0, 1$, $n\ge 1$. Then by Theorem 2.2
we have

$$\Delta c_0-\Delta c_1x+\Delta c_2x^2-\cdots\sim\frac{\Delta c_0}{1}+\frac{g_1(1-g_2)x}{1}+\frac{g_2(1-g_3)x}{1}+\cdots.$$

Now $\Delta c_n=\int_{-\infty}^1u^n\,d\phi_1(u)$, where $\phi_1(u)=\int_{-\infty}^u(1-t)\,d\phi(t)$
is monotone nondecreasing for $-\infty<u<1$, and consequently $\Delta c_0=c_0(1-g_1)>0$,
$g_{2n-1}g_{2n}(1-g_{2n})(1-g_{2n+1})>0$, $n\ge 1$. The theorem now follows. We shall
leave unanswered the question of the converse of this theorem.

8. Moment generating functions which are bounded in the unit circle.
The problem of this section is to specialize the continued fractions of
Theorems 4.1 and 6.1 in such a way that the moment generating function
$f(x)=c_0-c_1x+c_2x^2-\cdots$ will be bounded in the unit circle, that is

$$M(f)=\underset{|x|<1}{\mathrm{l.u.b.}}\,|f(x)|$$

will be finite. Our first result is contained in the theorem which follows.


THEOREM 8.1. The function $f(x)=c_0-c_1x+c_2x^2-\cdots$ is in the class $E$ of
moment generating functions bounded in the unit circle, if and only if there is a
correspondence of the form

$$hf(x)\sim g_1/1+(1-g_1)g_2x/1+(1-g_2)g_3x/1+\cdots,$$

for some sufficiently small positive number $h$ and for real $g_n$'s such that $0\le g_n\le 1$,
$n=1, 2, 3, \cdots$, where it is agreed that the continued fraction shall terminate
with the first identically vanishing partial quotient. When the condition is satisfied,
$M(f)\le 1/h$, and $h$ can be taken equal to 1 if and only if $M(f)\le 1$.


Proof of sufficiency. Suppose first that the continued fraction terminates

(13) H. Hamburger, Mathematische Annalen, vol. 81 (1920), pp. 235-319, and vol. 82
(1921), pp. 120-137.
(14) Cf. footnote 8.


and that Af(x)=g:, OS giS1, or hf(x) =gi1/1+(1—gi)gex/1+ --- +(1—gz) 
n=1, 2, The first possibility 
is at once disposed of. When gi41<1, the second possibility is disposed of by 
Theorem 3.1; and when g;.4:=1, it is required to be shown that the rational 
function hf(x) has no poles for |x| <1. To do this, recall that the function 
1/(1+xhf(x)) has the form 


Mo M, 4 M m-1 4 Mn 
1+ 1+ 1+ 1+ 


where %»=0<x1<x2< Mo20, M;>0,7=1, 2, --+-,m,and that 
in the case under consideration x,, = 1. Consequently, the zero of this function 
which lies nearest the origin is less than —1, so that the function 1+xhf(x) 
has no pole for |x| <1 and therefore M(f) is finite. It will be seen that 
f(-1) =1/h = M(f). 

When the continued fraction does not terminate, then by Theorem 3.1,
$M(f)\le 1/h$, and $f(x)\in E$.

Proof of necessity. Suppose conversely that f(x) e« E, and let f(x) =co/1 
where co20, OSh, 51, n21. 
If co=0 or 2,=0, the theorem is obviously true. Suppose co>0, 0<A, <1, 
n=1,2,---,k—1,0<h;,31, and that the continued fraction terminates by 
having =0. Then we must have <1, for otherwise f(x) has a 
pole at x= —1. Now consider, whether or not the continued fraction termi- 
nates, the function 


1/(1 + xhf(x)) = 1/1 + heox/1 + + (1 — 
+ (1 — 


Since the continued fraction 
+(1—he)hsx/1+ --- converges uniformly for | x| <1, by Theorem 3.1, and 
cannot vanish for | x| <1since M(f) < © by hypothesis. It readily follows that 
h>0 can be so chosen (4=1 if M(f) 1), that for every r, 0<r<1, the con- 
tinued fraction for 1/(1+xhf(x)) converges uniformly for | x| <r, and conse- 
quently we must have, for the chosen value of h, 


1/(1 + xhf(x)) = f dé(u)/(1 + xu), 


where ¢(u) is some monotone nondecreasing function. Hence 1/(1+xhf(x)) 
is a moment generating function, and therefore has a representation of the 
form ---, so that hf(x) has a rep- 
resentation as prescribed by the theorem. 

Another characterization of E in terms of continued fractions is given by 
the theorem which follows: 




THEOREM 8.2. Given a moment generating function f(x) =co/1+g:x/1 
+(1—gi)gex/1+ --- (co2z0, OSg, f(x) E if and only if f(x)=coz0, 
or f(x) --- where co>0, 
O<gn<i1, m=1, 2, 3,---, k—-1; or else f(x) 
+(1—ge)gsx/1+ --- , where co>0,0<g,<1, m21, and the series 


£182°°* gn 
8.1 1 


converges. 


Proof. The case of the terminating continued fraction is easily disposed of, 
for when the condition is satisfied the rational function has no poles for 
| x| <1, while if g,_;=1 (the only alternative) there is a pole at x= —1. 

In the case of the nonterminating continued fraction, the function 
Co/f (x) --- is analytic for | x| <1 
and continuous for | x| <1. Also by Theorem 2.1 the function (1+)f(x) en- 
joys these same properties. Thus f(x) is continuous for |x| <1 except possibly 
at x=—1. But by Theorem 3.1 f(—1)/co is equal to the series (8.1) and is 
therefore finite if and only if the latter converges. 

9. A characterization of E in terms of the moments c,. We now prove 


THEOREM 9.1. A moment generating function $f(x)=c_0-c_1x+c_2x^2-\cdots$ is
in $E$ if and only if the series $c_0+c_1+c_2+\cdots$ converges, and when the condition
is satisfied we have

$$M(f)=c_0+c_1+c_2+\cdots.$$


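Proof. Sufficiency: When the series $c_0+c_1+c_2+\cdots$ converges, the sequence
$d_n=c_n+c_{n+1}+c_{n+2}+\cdots$, $n=0, 1, 2, \cdots$, is totally monotone, so
that the power series $d_0-d_1x+d_2x^2-\cdots$ has a corresponding continued
fraction of the form $d_0/1+h_1x/1+(1-h_1)h_2x/1+(1-h_2)h_3x/1+\cdots$
($d_0\ge 0$, $0\le h_n\le 1$). Then, since $\Delta d_n=c_n$, it follows from Theorem 2.2 that
$f(x)\sim d_0(1-h_1)/1+h_1(1-h_2)x/1+h_2(1-h_3)x/1+\cdots$. We
put $g_n=1-h_n$, $n=1, 2, 3, \cdots$. That $f(x)\in E$ now follows from Theorem 8.1.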

Necessity: The necessity of the condition results at once from a theorem of
Abel. However, it is interesting to give a direct proof. Suppose then that
$f(x)\in E$. Then by Theorem 8.1 we must have $hf(x)\sim g_1/1+(1-g_1)g_2x/1+(1-g_2)g_3x/1+\cdots$
for some $h>0$, where the $g_n$'s are restricted as in Theorem 8.1.
Therefore $1+hxf(x)\sim 1+g_1x/1+(1-g_1)g_2x/1+\cdots$, so
that, by Theorem 2.1,

$$\frac{1+hxf(x)}{1+x}\sim\frac{1}{1}+\frac{(1-g_1)x}{1}+\frac{g_1(1-g_2)x}{1}+\cdots.$$

Consequently the function $(1+hxf(x))/(1+x)$ is a moment generating function.
Its power series expansion is $1-(1-hc_0)x+(1-h(c_0+c_1))x^2-\cdots$, so
that $1-h(c_0+c_1+c_2+\cdots+c_n)\ge 0$ for $n=0, 1, 2, \cdots$. It follows that the
series is convergent.
Since $f(-1)=c_0+c_1+c_2+\cdots$, it follows that $M(f)$ is equal to this sum.
As a corollary to Theorem 9.1 we have
As a corollary to Theorem 9.1 we have 


THEOREM 9.2. The moment generating function $f(x)$ is in $E$ if and only if it
has a Stieltjes integral representation of the form

$$f(x)=\int_0^1\frac{(1-u)\,d\phi(u)}{1+xu}$$

in which $\phi(u)$ is bounded and monotone nondecreasing.

Proof. If $f(x)\in E$, then the series $\sum c_i$ converges. Let $d_n$ be defined as in
the proof of Theorem 9.1. Then the function $d_0-d_1x+d_2x^2-\cdots$ has a representation
of the form $\int_0^1d\phi(u)/(1+xu)$, where $\phi(u)$ is bounded and monotone
nondecreasing. Since $c_n=\Delta d_n$, we must have $f(x)=\int_0^1(1-u)\,d\phi(u)/(1+xu)$, and the necessity
of the condition is proved. Conversely, if the condition is fulfilled, put
$d_0-d_1x+d_2x^2-\cdots=\int_0^1d\phi(u)/(1+xu)$, where $\phi(u)$ is given. Then $c_n=\Delta d_n$,
so that $\sum c_i=d_0-\lim_{n\to\infty}d_n$ is convergent, and consequently $f(x)\in E$.

10. The algorithm of Schur. Starting with a function $f(x)$ of $E$ for which
$M(f)<1$, we construct with Schur(15) a sequence of functions $f_n(x)$ in the following
way. Put $f_0(x)=f(x)$,

(10.1) $$f_{n+1}(x)=\frac{1}{x}\cdot\frac{t_n-f_n(x)}{1-t_nf_n(x)},\qquad t_n=f_n(0).$$

If $M(f)=1$, then $M(f_n)=1$, and if $M(f)<1$, then $M(f_n)<1$, $(n=1, 2, 3, \cdots)$.
If $f(x)=g_1$, then $f_n(x)=0$ for $n=1, 2, 3, \cdots$. If $f(x)=g_1/1+(1-g_1)g_2x/1$,
then $f_1(x)=g_1^{(1)}/1+(1-g_1^{(1)})g_2^{(1)}x/1$, where

$$g_1^{(1)}=g_1g_2/(1+g_1),\qquad g_2^{(1)}=g_2/(1+g_1-g_1g_2),$$

so that if $0<g_1<1$, $0<g_2\le 1$, then $0<g_1^{(1)}<1$, $0<g_2^{(1)}\le 1$. We shall prove this
theorem:


THEOREM 10.1. If 
n=1, 2, 3,---+, with the agreement that the continued fraction shall terminate 
with the first identically vanishing partial numerator, so that $f(x)$ is a moment
generating function and $M(f)<1$, then the functions $f_1(x), f_2(x), f_3(x), \cdots$ given
by (10.1) are all moment generating functions and $M(f_n)<1$. The only case
where any of these functions are constants is where $f(x)=g_1$, whereupon $f_n(x)=0$,
$n=1, 2, 3, \cdots$.


Proof. By (10.1), if f(x) is not a constant, then 
(15) Cf. footnote 4.


(1 — gi) 


= g1 — g:/1 + go(1 — + — ge)x/1+---, 


where we have put 41=gi/(1+ 41). This function is evi- 
dently of the form gi—g(x), where g(x) is a moment generating function and 
g(0)=g:. Put g(x) =gi—dox+dix?— ---, and we find that fi(x) =d)—d.x 
+d2x*— --- and is therefore a moment generating function. On account 
of the character of the transformation used, M(fi) <1. Consequently, by 
Theorem 8.1, fi(x) gx/1+ where 
<1, =1, 2,3, --- , with the oft repeated convention regarding termi- 
nation of the continued fraction. 

In the same way, starting with $f_1(x)$ instead of with $f(x)$, we find that
$f_2(x)$ is a moment generating function, and $M(f_2)<1$; and by induction,
$f_3(x), f_4(x), \cdots$ all have this property.

We saw previously that when $f(x)=g_1$, then $f_n(x)=0$, $(n=1, 2, 3, \cdots)$.
It remains to be shown that this is the only case where any of the functions
can reduce to a constant. To do this, it suffices to show that if $f_1(x)=c$, a
constant, then $c=0$. We have

$$f(x)=\frac{g_1-cx}{1-cxg_1}=g_1-c(1-g_1^2)x-g_1c^2(1-g_1^2)x^2-\cdots.$$

Since this is a moment generating function, we must have $-g_1c^2(1-g_1^2)\ge 0$,
which implies that $c=0$, or else $g_1=0$ or 1. But when $g_1=0$ or 1, we must
have $f=g_1$, and consequently $f_1=c=0$ in this case also.

As a corollary we have 


THEOREM 10.2. If $f(x)$ is a moment generating function for which $M(f)<1$,
then the sequence $t_0, t_1, t_2, \cdots$ given by (10.1) has the property of a totally monotone
sequence that if any member is 0 the others are also 0 with the possible exception
of the first.

In a number of examples which we have examined, the sequence $\{t_n\}$ has
been found to be totally monotone. Technical difficulties have thus far prevented
us from determining whether or not this is always the case. Also, the
question as to the converse of this naturally arises, namely, if $\{t_n\}$ is a totally
monotone sequence, will the function $f(x)$ which this sequence determines be
a moment generating function with $M(f)<1$?
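Examples of this kind are easy to generate on truncated power series. The sketch below assumes the Schur step has the form $f_{n+1}(x)=(t_n-f_n(x))/(x(1-t_nf_n(x)))$ with $t_n=f_n(0)$, and the function names are illustrative. Each step loses one coefficient of accuracy, so the input series must be long enough for the number of parameters requested. The sample function takes every $g_n=1/2$, an assumption made only for illustration.

```python
from fractions import Fraction

def series_inverse(p, n):
    """First n coefficients of 1/p(x), assuming p[0] != 0."""
    q = [Fraction(1) / p[0]]
    for k in range(1, n):
        s = sum(p[j] * q[k - j] for j in range(1, k + 1))
        q.append(-s / p[0])
    return q

def series_mul(p, q, n):
    """First n coefficients of p(x) q(x); p and q hold at least n coefficients."""
    return [sum(p[j] * q[k - j] for j in range(k + 1)) for k in range(n)]

def sfraction_series(a, n):
    """First n coefficients of a[0]/(1 + a[1]x/(1 + a[2]x/(1 + ...)))."""
    tail = [Fraction(1)] + [Fraction(0)] * (n - 1)
    for coeff in reversed(a[1:]):
        inv = series_inverse(tail, n)
        tail = [Fraction(1)] + [coeff * inv[k - 1] for k in range(1, n)]
    return [a[0] * c for c in series_inverse(tail, n)]

def schur_parameters(f, count):
    """Return t_0, ..., t_{count-1} for the truncated series f (needs |t_n| != 1)."""
    f, ts = list(f), []
    for _ in range(count):
        t = f[0]
        ts.append(t)
        n = len(f) - 1
        if n == 0:
            break
        num = [-c for c in f[1:]]                       # (t - f(x)) / x
        den = [1 - t * f[0]] + [-t * c for c in f[1:n]]  # 1 - t f(x), n terms
        f = series_mul(num, series_inverse(den, n), n)
    return ts

g = [Fraction(1, 2)] * 8
a = [g[0]] + [(1 - g[i - 1]) * g[i] for i in range(1, 8)]
f = sfraction_series(a, 8)        # here f(x) = 1/(1 + sqrt(1 + x)), truncated
t = schur_parameters(f, 3)
```

For this sample the first three parameters come out as $1/2$, $1/6$, $1/10$, and $t_1$ agrees with $g_1g_2/(1+g_1)$.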

In conclusion we shall give recursion formulas for computing the ?¢,’s in 
terms of the g,’s. From (10.2) we have 


1 1 1 + 




which is equal to g«/1+ ---. On equat- 
ing the odd part of the first of these continued fractions to the even part of 
the second we obtain at once the formulas 


(1) (1) (1) 
gi = gig2/(1 + gi), ge (1—gi )= 
1+ £1 


(10.3) a) 


+ gs(1 — ge), 


(1) (1) 
— gn—i)(1 — gn ) = — 8n)(1 — gn+i), 


qa) q) q) q) 
— gn ) + — = — gnti) + — 


n=2, 4, 6,---. By means of these relations one may show that the se- 
quence t,=g™, n=0, 1, 2,--~- (g©’=g:) is monotone decreasing. In fact, 
tn 

Using a result of Schur(**) one may obtain formulas for the ¢,’s in terms of 
the g,’s. To do this, let the mth approximant of the continued fraction for 
f(x) be 
An(%) do + + aex? +--+ + 


B(x) bo + bie + box? + b,x" 


This rational function is a moment generating function, and its modulus is 
less than or equal to 1 for | x| <1 provided M(f) <1. Moreover, the sequence 
{t;} for this function will be in agreement with that for f(x) up to and in- 
cluding the term of index n—1. Moreover, the a,’s and b;’s can be computed 
in terms of the g,’s. Using the notation of Schur, we then put 


bo 0 ao Gm-2 


Gm—3 


Gm—1 0 0---0 

where a;=0 if i>k and b;=0 if i>r. A formula of Schur(!”) then gives for ¢; 


which is valid for 1, 2,---,2—1. 


(16) Schur, loc. cit. (first part), pp. 213-215.
(17) Loc. cit., p. 215, formula (12).


NORTHWESTERN UNIVERSITY, 
Evanston, ILL. 


5 bm—-1 bo 0 0---0 ao 
| 


HAUSDORFF METHODS OF SUMMATION AND 
CONTINUED FRACTIONS 


BY 
H. L. GARABEDIAN AND H. S. WALL 


1. Introduction. We shall be occupied in this paper with a special study of
the transformation

(1.1) $$t_m=\sum_{n=0}^{m}a_{mn}s_n,\qquad m=0,1,2,\cdots,$$

where $\{s_n\}$ is an infinite sequence of numbers. If, in particular, we are concerned
with the infinite series $\sum u_n$, then $s_n=\sum_{i=0}^{n}u_i$. Let $A=(a_{mn})$ represent
the triangular matrix of the transformation (1.1). Then, the sequence
$\{t_m\}$ is called the transform of $\{s_n\}$ by the matrix $A$ and is represented in
symbolic form by $t_m=A\{s_n\}$. If the matrix $A$ is regular, in the sense of Silverman
[1] and Toeplitz [2], then the matrix $A$ defines a regular method of
summation (summability).

In this paper our attention is focused on a class of regular and permutable
matrices $\{A\}$, known as Hausdorff matrices ([3] or [4]), which we shall presently
define. Let $\{c_n\}$ be an infinite sequence of numbers defined by the
Stieltjes integrals

(1.2) $$c_n=\int_0^1u^n\,d\phi(u).$$

Then, we form the matrix $D(\delta_{mn}c_m)D$, where $\delta_{mn}=0$ for $m\ne n$, $\delta_{mm}=1$, and
$D=((-1)^nC_{m,n})$. Incidentally, the matrix $D$ has the property $D^2=I$, where
$I$ is the identity matrix. The following conditions on the mass function $\phi(u)$
are necessary and sufficient in order that the matrix $D(\delta_{mn}c_m)D$ be regular:

(1.3) (i) $\phi(u)$ is of bounded variation on the closed interval (0, 1);
(ii) $\phi(u)$ is continuous at $u=0$ and $\phi(0)=0$;
(iii) $\phi(1)=1$;
(iv) $\phi(u)=\frac{1}{2}[\phi(u-0)+\phi(u+0)]$, $0<u<1$.

If $\phi(u)$ satisfies the conditions (1.3), then the matrix $D(\delta_{mn}c_m)D$ is a Hausdorff
matrix and the sequence (1.2) is known as a regular sequence or a regular
moment sequence. If $A=D(\delta_{mn}c_m)D$ Hausdorff has proved that we can write

Presented to the Society, April 13, 1940; received by the editors February 2, 1940, and,
in revised form, February 24, 1940.


186 H. L. GARABEDIAN AND H. S. WALL [September 


We shall be concerned for the most part with real totally monotone sequences
$\{c_n\}$ which are characterized in any one of the following three ways:

(1.5) (i) $\Delta^mc_n\ge 0$, $(m, n=0, 1, 2, \cdots)$;
(ii) there exists (essentially uniquely(1)) a real monotone non-decreasing
$\phi(u)$ such that $c_n=\int_0^1u^n\,d\phi(u)$, $(n=0, 1, 2, \cdots)$;
(iii) there is a correspondence(2) of the form
$$c_0-c_1x+c_2x^2-\cdots\sim\frac{c_0}{1}+\frac{g_1x}{1}+\frac{(1-g_1)g_2x}{1}+\cdots,$$
where $c_0\ge 0$, $0\le g_n\le 1$, $(n=1, 2, 3, \cdots)$, and where it is understood that the
continued fraction shall terminate with the first partial quotient which vanishes
identically.

These characterizations are due respectively to I. Schur, F. Hausdorff [5],
and H. S. Wall [6].
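Characterization (1.5, i) is easy to examine on a concrete sequence. The sketch below builds the first rows of the difference table, with $\Delta c_n=c_n-c_{n+1}$, for the moments $c_n=1/(n+1)$ of the mass function $\phi(u)=u$; this sample sequence and the function name `difference_table` are assumptions made for illustration. For it, (1.5, ii) gives $\Delta^mc_n=\int_0^1(1-u)^mu^n\,du=m!\,n!/(m+n+1)!$.

```python
from fractions import Fraction
from math import factorial

def difference_table(c, depth):
    """Rows 0..depth of the table of differences, row m holding Delta^m c_n."""
    rows = [list(c)]
    for _ in range(depth):
        prev = rows[-1]
        # Delta c_n = c_n - c_{n+1}
        rows.append([prev[n] - prev[n + 1] for n in range(len(prev) - 1)])
    return rows

c = [Fraction(1, n + 1) for n in range(8)]   # moments of phi(u) = u on (0, 1)
table = difference_table(c, 4)
```

Every entry of `table` is positive here, as (1.5, i) requires of a totally monotone sequence.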

In §2 of this paper we give necessary and sufficient conditions for the regularity
of a totally monotone sequence in terms of the corresponding continued
fraction. In §3 we investigate certain properties of the difference matrix
$\Delta=(\Delta^mc_n)$. Any row, column, or diagonal sequence of the difference matrix
is found to be totally monotone. We obtain necessary and sufficient conditions,
in terms of the continued fraction corresponding to the base sequence,
that is, the sequence defined by the first row of $\Delta$, in order that the row, column,
and diagonal sequences be regular. This is accomplished with the aid
of the following curious result: if (1.5, iii) obtains, then the power series
$c_0-\Delta c_0x+\Delta^2c_0x^2-\cdots$, with coefficients in the first column of the difference
matrix $\Delta$, corresponds to the continued fraction
$$\frac{c_0}{1}+\frac{(1-g_1)x}{1}+\frac{g_1g_2x}{1}+\frac{(1-g_2)(1-g_3)x}{1}+\frac{g_3g_4x}{1}+\cdots$$
obtained from the continued fraction of (1.5, iii) by replacing $g_{2n-1}$ by $1-g_{2n-1}$,
$(n=1, 2, 3, \cdots)$. Section 4 is devoted to an example which illustrates some
of the results of the two preceding sections.

The remainder of this paper has to do with special Hausdorff methods of 
summation. In §5 the continued fraction of Gauss is employed to obtain a 
regular sequence which results in the definition of hypergeometric summability. 
Numerous inclusion and equivalence relations between the hypergeometric
methods are derived. In §6 we replace the base sequence of A by known se- 


(1) This means that $\phi(u)$ exists uniquely except for an additive constant at all points of
continuity.

(2) We use the symbol $\sim$ between a power series and a continued fraction to indicate that
the power series expansion of the $n$th approximant of the continued fraction agrees term by
term with the given power series for more and more terms as $n$ is increased, or becomes identical
with it from and after some $n$.


i 


1940] HAUSDORFF SUMMATION AND CONTINUED FRACTIONS 187 


quences and discuss the new methods of summability thus generated and 
some of their properties. Finally, in §7, we discuss the effectiveness of meth- 
ods of summation associated with regular totally monotone sequences relative 
to the analytic continuation of power series outside the circle of convergence. 

2. The Stieltjes continued fraction as a tool in the theory of summability.
In his celebrated memoir Recherches sur les fractions continues, Stieltjes [7]
showed that a real sequence $\{c_n\}$ is a moment sequence for an infinite distribution
of mass along the positive half of the real axis if and only if the
power series $c_0-c_1x+c_2x^2-\cdots$ has a corresponding continued fraction of
the form $b_1/1+b_2x/1+b_3x/1+\cdots$, where the $b_n$'s are real and positive. The
continued fraction is uniquely determined by the moment sequence, and conversely
there is uniquely determined a moment sequence by means of a continued
fraction of the specified form. Moreover, the question as to the uniqueness
of the distribution of mass for a given moment sequence can always be
decided when the continued fraction is known.

When the continued fraction converges, Stieltjes showed that the function
represented has the form

$$\int_0^\infty\frac{d\phi(u)}{1+xu},$$

where $\phi(u)$ is a bounded monotone non-decreasing function, and represents
the distribution of mass in accordance with the relations $c_n=\int_0^\infty u^n\,d\phi(u)$,
$(n=0, 1, 2, \cdots)$, the sequence $\{c_n\}$ defining the given moments. In this case
the moment problem is determinate, that is, there is but one possible distribution
of mass. On the other hand, if the continued fraction diverges, the sequences
of even and odd approximants converge to separate integrals of the
above form, which determine two distinct solutions of the moment problem.
In this, the indeterminate case, there are infinitely many distributions of mass
for the given moments.

Until recently it was not known how to specialize the numbers $b_n$ in the
continued fraction in the case of moments for a distribution of mass over the
finite interval (0, 1). The answer to this question is contained in the statement
(1.5, iii), where, in the case of the terminating continued fraction, the
moments are for a finite distribution of mass.

The moment problem for the interval (0, 1) was solved without the use
of continued fractions by Hausdorff [3]. He found, among other things, that
these sequences are regular(3) if and only if the mass function $\phi(u)$ of (1.5, ii)
is continuous at $u=0$, and $\int_0^1d\phi(u)=c_0=1$. One of the ways in which the continued
fraction may serve as a tool in this theory is in establishing the continuity
or the discontinuity of $\phi(u)$ at $u=0$.

(3) It is sometimes convenient to state the regularity conditions (1.3) in this form. The
requirements $\phi(0)=0$, $\phi(1)=1$ can be replaced by the single condition $\int_0^1d\phi(u)=1$. In the
discussion which follows we shall always assume without further mention that this requirement is
met. The regularity condition (1.3, iv) is in a sense superfluous since it serves merely to determine
$\phi(u)$ uniquely at every point of the interval (0, 1).

We shall begin by disposing of the case where the continued fraction (1.5,
iii) terminates, the case corresponding to a finite distribution of mass. Here
the function $\phi(u)$ is a step function with but a finite number of discontinuities,
one at each point where a quantity of mass is concentrated. There is no
discontinuity at $u=0$ if and only if for some index $n$ the first $2n$ partial quotients
of the continued fraction are not identically zero while the next partial quotient
vanishes identically. This [6] is a consequence of the fact that in this case,
and only this, the continued fraction may be written as a sum of partial
fractions of the form $\sum_{i=1}^{n}M_i/(1+xx_i)$, without a constant term, where $M_i>0$,
$0<x_1<x_2<\cdots<x_n\le 1$. The moments are then $c_k=\sum_{i=1}^{n}M_ix_i^k$, the function
$\phi(u)$ having discontinuities only at the points $x_i$.

When the continued fraction does not terminate, it may always be written
in the form $1/a_1+x/a_2+x/a_3+\cdots$, where the $a_n$'s are positive. Then from
the work of Stieltjes [8, p. 510] we have the theorem that $\phi(u)$ is continuous
at $u=0$ if and only if the series $\sum a_{2n-1}$ diverges. Since

$$a_1=1/c_0,\qquad a_3=g_1/c_0g_2(1-g_1),$$
$$a_{2n+1}=\frac{g_1g_3\cdots g_{2n-1}(1-g_2)(1-g_4)\cdots(1-g_{2n-2})}{c_0g_2g_4\cdots g_{2n}(1-g_1)(1-g_3)\cdots(1-g_{2n-1})},$$

this condition appears at once in terms of the parameters $g_n$. Remembering
that in addition to the continuity of $\phi(u)$ at $u=0$ we require for regularity
that $c_0=1$, we have the following theorem.


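THEOREM 2.1. The totally monotone sequence $\{c_n\}$ is regular if and only if the power
series $c_0-c_1x+c_2x^2-\cdots$ has a terminating corresponding continued fraction of the
form

$$\frac{1}{1} {}_{+} \frac{g_1x}{1} {}_{+} \frac{(1-g_1)g_2x}{1} {}_{+} \cdots {}_{+}
\frac{(1-g_{2n-2})g_{2n-1}x}{1}, \tag{2.1}$$

where $0 < g_k < 1$, $(k=1, 2, 3, \cdots, 2n-2)$, $0 < g_{2n-1} \le 1$ (in which case the
distribution of mass is finite), or else has a nonterminating continued fraction of the form

$$\frac{1}{1} {}_{+} \frac{g_1x}{1} {}_{+} \frac{(1-g_1)g_2x}{1} {}_{+}
\frac{(1-g_2)g_3x}{1} {}_{+} \cdots, \tag{2.2}$$

where $0 < g_n < 1$, $(n=1, 2, 3, \cdots)$, and the series

$$\sum_{n=1}^{\infty}\frac{g_1g_3\cdots g_{2n-1}\,(1-g_2)(1-g_4)\cdots(1-g_{2n-2})}
{g_2g_4\cdots g_{2n}\,(1-g_1)(1-g_3)\cdots(1-g_{2n-1})} \tag{2.3}$$

diverges (in which case there is an infinite distribution of mass).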


1940] HAUSDORFF SUMMATION AND CONTINUED FRACTIONS 189 


We shall record here for future reference some known properties of the 
special Stieltjes continued fraction with which we are concerned. These will 
be stated in the form of theorems, with adequate references. 


THEOREM 2.2 [6, p. 166]. If $g_1, g_2, g_3, \cdots$ are any real or complex numbers, and
$P(x)$ is a power series in ascending powers of $x$ such that

$$P(x) \sim 1+g_1x/1+(1-g_1)g_2x/1+(1-g_2)g_3x/1+\cdots,$$

then

$$(1+x)/P(x) \sim 1+(1-g_1)x/1+g_1(1-g_2)x/1+g_2(1-g_3)x/1+\cdots.$$

If $c_0 \ne 0$ and we put $c_0/P(x)=c_0-c_1x+c_2x^2-\cdots$, this statement takes the
following form: if $c_0-c_1x+c_2x^2-\cdots \sim c_0/1+g_1x/1+(1-g_1)g_2x/1+\cdots$, then
$\Delta c_0-\Delta c_1x+\Delta c_2x^2-\cdots \sim \Delta c_0/1+g_1(1-g_2)x/1+g_2(1-g_3)x/1+\cdots$.


THEOREM 2.3 [9, p. 159]. If $g_1, g_2, g_3, \cdots$ are real, $0 < g_1 < 1$,
$0 \le g_n < 1$, $n > 1$, then the continued fraction

$$\frac{g_1}{1} {}_{+} \frac{(1-g_1)g_2x}{1} {}_{+} \frac{(1-g_2)g_3x}{1} {}_{+} \cdots \tag{2.4}$$

converges uniformly for $|x| \le 1$. The function $f(x)$ represented is continuous for
$|x| \le 1$, analytic for $|x| < 1$, and its modulus for $|x| \le 1$ does not exceed

$$1 - \left[\,1 + \sum_{n=1}^{\infty}\frac{g_1g_2\cdots g_n}{(1-g_1)(1-g_2)\cdots(1-g_n)}\right]^{-1}.$$

This is the least upper bound, since it is assumed by $f(x)$ at $x=-1$.

The continued fraction converges uniformly over every bounded closed region containing no
real point $x$ which is less than or equal to $-1$, and $f(x)$ is analytic in every such
region [8].


3. The difference matrix. Let $\{c_n\}$ be a totally monotone sequence. Then

$$c_n = \int_0^1 u^n\,d\phi(u),$$

where $\phi(u)$ is monotone non-decreasing. Now, the $m$th difference of $c_n$ is given by
the formula

$$\Delta^m c_n = \int_0^1 (1-u)^m u^n\,d\phi(u),\qquad m, n = 0, 1, 2, \cdots.$$

If we keep $m$ fixed and allow $n$ to vary, it is evident that the resulting sequence
$(\Delta^m c_0, \Delta^m c_1, \Delta^m c_2, \cdots)$ is totally monotone, the mass function
being $\int_0^u (1-t)^m\,d\phi(t)$. Moreover, if we keep $n$ fixed and allow $m$ to vary we
obtain the sequence $(c_n, \Delta c_n, \Delta^2 c_n, \cdots)$. Inasmuch as
$\Delta^m c_n = \int_0^1 u^m(1-u)^n\,d[-\phi(1-u)]$, it is clear that this sequence is
totally monotone, the mass function being $\int_0^u[-(1-t)^n]\,d\phi(1-t)$. Thus, the row
and column sequences of the difference matrix $\Delta$ are all totally monotone.
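The statement about rows and columns can be checked on a small case. The sketch below is
an illustration only: it assumes the base sequence $c_n = 1/(n+1)$ (the moments of
$\phi(u)=u$), and the helper names `difference_matrix` and `beta_entry` are hypothetical.
It builds $\Delta^m c_n$ by repeated differencing with $\Delta c_n = c_n - c_{n+1}$ and
compares each entry with the integral $\int_0^1 (1-u)^m u^n\,du = m!\,n!/(m+n+1)!$.

```python
from fractions import Fraction
from math import factorial

def difference_matrix(c, size):
    """Return D with D[m][n] = Delta^m c_n, where Delta c_n = c_n - c_{n+1}."""
    rows = [list(c[: 2 * size])]            # row 0 is the base sequence itself
    for _ in range(1, size):
        prev = rows[-1]
        rows.append([prev[n] - prev[n + 1] for n in range(len(prev) - 1)])
    return [row[:size] for row in rows]

def beta_entry(m, n):
    """Closed form of the integral of (1-u)^m u^n over (0, 1)."""
    return Fraction(factorial(m) * factorial(n), factorial(m + n + 1))

SIZE = 6
base = [Fraction(1, n + 1) for n in range(2 * SIZE)]   # assumed example: phi(u) = u
D = difference_matrix(base, SIZE)

matches_integral = all(D[m][n] == beta_entry(m, n)
                       for m in range(SIZE) for n in range(SIZE))
all_nonnegative = all(entry >= 0 for row in D for entry in row)
```

Since every entry of the matrix is itself a difference of some order of a row or a column,
non-negativity of all entries is exactly the total monotonicity of those sequences in this
example.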
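Consider next a diagonal sequence

$$c_k,\ \Delta c_{k+1},\ \Delta^2 c_{k+2},\ \cdots \tag{3.1}$$

or

$$\Delta^k c_0,\ \Delta^{k+1} c_1,\ \Delta^{k+2} c_2,\ \cdots. \tag{3.2}$$

We may write

$$\Delta^n c_{n+k} = \int_0^{1/4} v^n\,d\phi'(v),\qquad
\Delta^{n+k} c_n = \int_0^{1/4} v^n\,d\phi''(v),$$

where

$$d\phi'(v) = u_1^k\,d\phi(u_1) - u_2^k\,d\phi(u_2),\qquad
d\phi''(v) = u_2^k\,d\phi(u_1) - u_1^k\,d\phi(u_2),$$

and $u_1$ is the smallest and $u_2$ the largest of the roots of the quadratic equation
$u^2-u+v=0$, $0 \le v \le \frac14$. Accordingly we observe that the sequences (3.1) and
(3.2) are totally monotone.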

An inspection of the mass functions for these row, column, and diagonal sequences reveals
that they can all be made regular, by dividing all members of each by its first member, if
and only if $\phi(u)$ is continuous at $u=0$ and at $u=1$. In Theorem 2.1 we gave necessary
and sufficient conditions for the continuity at $u=0$, in terms of the continued fraction
corresponding to the base sequence. We now propose to do likewise for the point $u=1$.

The case of a finite distribution of mass is disposed of by means of the 
next theorem, which follows from the work of Wall [6]. 


THEOREM 3.1. Let $\{c_n\}$ be a totally monotone sequence corresponding to a finite
distribution of mass, and let $c_0/1+g_1x/1+(1-g_1)g_2x/1+\cdots+(1-g_{n-1})g_nx/1$ be the
corresponding (necessarily terminating) continued fraction, where $c_0 > 0$,
$0 < g_i < 1$, $(i=1, 2, 3, \cdots, n-1)$, $0 < g_n \le 1$. Then the mass function
$\phi(u)$ is continuous at $u=1$ if and only if $g_n < 1$.


Turning to the case of an infinite distribution of mass, we consider the power series
$c_0-\Delta c_0x+\Delta^2c_0x^2-\cdots$ with coefficients from the first column of the
difference matrix. Let $c_0/1+h_1x/1+(1-h_1)h_2x/1+\cdots$ be the corresponding continued
fraction. Then it is apparent, in view of Theorem 2.1, that $\phi(1-u)$ will be continuous
at $u=0$, that is, $\phi(u)$ will be continuous at $u=1$, if and only if the series obtained
from (2.3) by replacing $g_n$ by $h_n$, $(n=1, 2, 3, \cdots)$, is divergent. The problem
will therefore be solved if we determine the $h_n$'s as functions of the $g_n$'s. We shall
prove that $h_{2n}=g_{2n}$, $h_{2n-1}=1-g_{2n-1}$, $(n=1, 2, 3, \cdots)$. In fact, the
following theorem provides a companion theorem to Theorem 2.2.


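THEOREM 3.2. Let $c_0$ be different from 0, and let $g_1, g_2, g_3, \cdots$ be arbitrary
real or complex numbers. Then, if

$$c_0-c_1x+c_2x^2-\cdots \sim \frac{c_0}{1} {}_{+} \frac{g_1x}{1} {}_{+}
\frac{(1-g_1)g_2x}{1} {}_{+} \frac{(1-g_2)g_3x}{1} {}_{+} \cdots, \tag{3.3}$$

we have

$$c_0-\Delta c_0x+\Delta^2c_0x^2-\cdots \sim \frac{c_0}{1} {}_{+} \frac{(1-g_1)x}{1} {}_{+}
\frac{g_1g_2x}{1} {}_{+} \frac{(1-g_2)(1-g_3)x}{1} {}_{+} \cdots, \tag{3.4}$$

where the second continued fraction is obtained from the first by replacing $g_{2n-1}$ by
$1-g_{2n-1}$, $(n=1, 2, 3, \cdots)$.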


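Let

$$c_0-a_1x+a_2x^2-\cdots \sim \frac{c_0}{1} {}_{+} \frac{(1-g_1)x}{1} {}_{+}
\frac{g_1g_2x}{1} {}_{+} \frac{(1-g_2)(1-g_3)x}{1} {}_{+} \cdots. \tag{3.5}$$

It is required to prove that $a_n=\Delta^nc_0$. Replace $x$ by $-x$, multiply by $x$, and
then replace $x$ by $x/(1+x)$ in the last relation. We obtain, after division by $x$,

$$\frac{c_0}{1+x}+\frac{a_1x}{(1+x)^2}+\frac{a_2x^2}{(1+x)^3}+\cdots \sim
\frac{c_0}{1+x} {}_{-} \frac{(1-g_1)x}{1} {}_{-} \frac{g_1g_2x}{1+x} {}_{-}
\frac{(1-g_2)(1-g_3)x}{1} {}_{-} \cdots. \tag{3.6}$$

Now, the even part of the last continued fraction (that is, the continued fraction having
as its sequence of approximants the even approximants of this continued fraction) is the
same as the even part of the continued fraction of (3.3). It follows that the formal power
series expansion of the left-hand member of (3.6) is $c_0-c_1x+c_2x^2-\cdots$ and hence
that $a_n$ must equal $\Delta^nc_0$.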
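Theorem 3.2 lends itself to a direct machine check on truncated series. The following is a
minimal sketch, not the authors' procedure: it assumes arbitrary rational test values for
the $g_n$, and the helpers `cf_series` and `stieltjes_numerators` are hypothetical names.
It expands the continued fraction with parameters $g_n$ to obtain $c_0, c_1, \cdots$, forms
$\Delta^n c_0 = \sum_k (-1)^k C_{n,k} c_k$, and compares with the expansion of the continued
fraction whose parameters are $h_{2n-1}=1-g_{2n-1}$, $h_{2n}=g_{2n}$.

```python
from fractions import Fraction
from math import comb

def cf_series(c0, numerators, order):
    """Coefficients of x^0..x^order of c0/(1 + a1*x/(1 + a2*x/(1 + ...)))."""
    def inverse(p):                       # reciprocal of a truncated series, p[0] != 0
        q = [Fraction(1) / p[0]]
        for k in range(1, order + 1):
            q.append(-sum(p[j] * q[k - j] for j in range(1, k + 1)) / p[0])
        return q
    tail = [Fraction(1)] + [Fraction(0)] * order
    for a in reversed(numerators):        # work outward: tail <- 1 + a*x/tail
        inv = inverse(tail)
        tail = [Fraction(1)] + [a * inv[k - 1] for k in range(1, order + 1)]
    return [c0 * t for t in inverse(tail)]

def stieltjes_numerators(g):
    """Partial numerators g_1, (1-g_1)g_2, (1-g_2)g_3, ... of the form (3.3)."""
    return [g[0]] + [(1 - g[k - 1]) * g[k] for k in range(1, len(g))]

N = 8
g = [Fraction(k + 1, 2 * k + 3) for k in range(N)]          # assumed test values in (0, 1)
h = [1 - gk if k % 2 == 0 else gk for k, gk in enumerate(g)]  # list index 0 holds g_1

series_g = cf_series(Fraction(1), stieltjes_numerators(g), N)
c = [(-1) ** n * t for n, t in enumerate(series_g)]          # c_0, c_1, ..., c_N
first_column = [sum((-1) ** k * comb(n, k) * c[k] for k in range(n + 1))
                for n in range(N + 1)]                        # Delta^n c_0

series_h = cf_series(Fraction(1), stieltjes_numerators(h), N)
from_h = [(-1) ** n * t for n, t in enumerate(series_h)]
```

With $N$ partial numerators the truncated fraction agrees with the full expansion through
$x^N$, so the comparison of `first_column` with `from_h` is exact in rational arithmetic.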

With reference to the difference matrix A the principal result of this dis- 
cussion may conveniently be summarized in the following theorem. 


THEOREM 3.3. Let $\{c_n\}$ be a totally monotone sequence corresponding to an infinite
distribution of mass. Then the sequence $(c_0, \Delta c_0, \Delta^2c_0, \cdots)$, the first
column of $\Delta$, is regular if and only if $c_0=1$ and the series

$$\sum_{n=1}^{\infty}\frac{(1-g_1)(1-g_2)\cdots(1-g_{2n-1})}{g_1g_2\cdots g_{2n}} \tag{3.7}$$

diverges, where the $g_n$'s are related to the $c_n$'s as in (1.5, iii). The sequence
$(c_0, \Delta c_1, \Delta^2c_2, \cdots)$, the principal diagonal of $\Delta$, is regular if
and only if $c_0=1$, and both of the series (2.3) and (3.7) are divergent.


We have given conditions for regularity of certain moment sequences, chosen from the
difference matrix, in terms of the continued fraction corresponding to the base sequence.
The conditions can also be given in terms of the moment generating function

$$f(x) = \int_0^1\frac{d\phi(u)}{1+xu}.$$

In fact, Schoenberg [10] gave necessary and sufficient conditions, in terms of $f(x)$, for
the continuity of $\phi(u)$ at an arbitrary point $u=t$, $0 \le t \le 1$. We may therefore
state the theorem which follows.


THEOREM 3.4. A totally monotone sequence $\{c_n\}$ with corresponding mass function
$\phi(u)$ is regular if and only if $c_0=1$ and

$$\lim_{x\to\infty}\int_0^1\frac{d\phi(u)}{1+xu} = 0, \tag{3.8}$$

where $x\to\infty$ along any ray except the negative half of the real axis. The sequence
$(c_0, \Delta c_0, \Delta^2c_0, \cdots)$ is regular if and only if $c_0=1$, and

$$\lim_{x\to-1}\,(1+x)\int_0^1\frac{d\phi(u)}{1+xu} = 0, \tag{3.9}$$

where $x\to-1$ through values interior to or upon the circle $|x|=1$. Finally, the sequence
$(c_0, \Delta c_1, \Delta^2c_2, \cdots)$ is regular if and only if $c_0=1$, and (3.8), (3.9)
both hold.


The regularity conditions can also be given in terms of the moments themselves. Indeed, it
is not difficult to show that $\lim_{n\to\infty}c_n=\phi(1)-\phi(1-0)$, and
$\lim_{n\to\infty}\Delta^nc_0=\phi(+0)-\phi(0)$. Hence we have the following theorem.


THEOREM 3.5. The limits (3.8) and (3.9) in Theorem 3.4 may be replaced by the limits

$$\lim_{n\to\infty}\Delta^nc_0 = 0, \tag{3.10}$$

and $\lim_{n\to\infty}c_n=0$, respectively.


If the limit (3.10) is $k > 0$, then $k$ is the amount of the discontinuity of $\phi(u)$
at $u=0$. If then we subtract $k$ from $c_0$ in the sequence $\{c_n\}$, the resulting
sequence is totally monotone, and can be made regular by dividing every member by $c_0-k$.

Wall [6] considered the class of moment generating functions
$f(x)=\sum_{p=0}^{\infty}c_p(-x)^p$ in which the coefficients $c_p$ form a totally monotone
sequence. In particular, he characterized in several ways the subclass of these functions
which are bounded in the unit circle. It is of interest to apply Theorem 3.2 to obtain
results of this kind.


THEOREM 3.6. If $f(x)=c_0/1+g_1x/1+(1-g_1)g_2x/1+(1-g_2)g_3x/1+\cdots$, where $c_0 > 0$,
$0 < g_n < 1$, $n \ge 1$, so that $f(x)$ is a moment generating function corresponding to an
infinite distribution of mass, then the modulus of the function $(1+x)f(x)$ is bounded in
the half-plane $\Re(x) > -\frac12$ if and only if the series

$$1+\frac{1-g_1}{g_1}+\frac{(1-g_1)g_2}{g_1(1-g_2)}
+\frac{(1-g_1)g_2(1-g_3)}{g_1(1-g_2)g_3}+\cdots \tag{3.11}$$

is convergent.


If we put $f_1(x)=c_0-\Delta c_0x+\Delta^2c_0x^2-\cdots$, then $(1+x)f(x)=f_1(w)$ where
$w=-x/(1+x)$. Now $\Re(x) > -\frac12$ if and only if $|w| < 1$. But [6, p. 181] if
$f_1(w)=c_0/1+h_1w/1+(1-h_1)h_2w/1+\cdots$, then the modulus of $f_1(w)$ is bounded for
$|w| < 1$ if and only if the series

$$1+\sum_{n=1}^{\infty}\frac{h_1h_2\cdots h_n}{(1-h_1)(1-h_2)\cdots(1-h_n)}$$

converges. Since by Theorem 3.2, $h_n$ equals $g_n$ or $1-g_n$ according as $n$ is even or
odd, it will be seen that the latter series is the same as (3.11).

One may prove that $(1+x)f(x)$ has its modulus bounded for $\Re(x) > -\frac12$ if and only
if $f(x)$ has a Stieltjes integral representation of the form
$\int_0^1u\,d\phi(u)/(1+xu)$; and that the moduli of $f(x)$ and of $(1+x)f(x)$ are bounded
in the unit circle, and in the half-plane $\Re(x) > -\frac12$, respectively, if and only if
the integral has the form $\int_0^1u(1-u)\,d\phi(u)/(1+xu)$.

4. An illustration. In this section we offer an example to illustrate some of the results
of the two preceding sections.

Let $r$ be real, $0 < r < 1$, and consider the function

$$f(x) = \frac{1}{1} {}_{+} \frac{r^2(1-r)x}{1} {}_{+} \frac{r^3(1-r^2)x}{1} {}_{+} \cdots.$$

We shall determine the moments generated by this function and the corresponding mass
function $\phi(u)$.

Put $F(x)=1+rxf(x)$, so that $F(x)=1+rx/1+r^2(1-r)x/1+r^3(1-r^2)x/1+\cdots$. Then by
Theorem 2.2 we have $(1+x)/F(x)=1+(1-r)x/1+r(1-r^2)x/1+r^2(1-r^3)x/1+\cdots$ so that $F(x)$
satisfies the functional equation

$$F(x) = 1+\frac{rx}{1+r^2x}\,F(r^2x). \tag{4.1}$$

If we put $f(x)=c_0-c_1x+c_2x^2-\cdots$, we obtain quite readily the values of the moments
$c_n$ from this functional relation:

$$c_0=1,\qquad c_n=r^{2n}(1-r)(1-r^3)\cdots(1-r^{2n-1}),\qquad n=1, 2, 3, \cdots. \tag{4.2}$$


We obtain the following items of information about the corresponding mass function
$\phi(u)$.

(a) Since $\lim_{n\to\infty}r^n(1-r^{n-1})=0$, it follows from a result of Stieltjes
[8, p. 560] that $f(x)$ is a meromorphic function of $x$. Consequently $\phi(u)$ is a step
function (with infinitely many discontinuities).

(b) Since $\lim c_n=0$, $\phi(u)$ is continuous at $u=1$ by Theorem 3.5.

(c) From a result of Wall [6, p. 172] there exist numbers $g_1, g_2, g_3, \cdots$,
$0 < g_n < 1$, such that $g_1=r^2(1-r)$, $g_2(1-g_1)=r^3(1-r^2)$,
$g_3(1-g_2)=r^4(1-r^3)$, $\cdots$. Then we find that the test-ratio for the series of
Theorem 2.1 is

$$\frac{g_{2n-1}(1-g_{2n-2})}{g_{2n}(1-g_{2n-1})}
= \frac{r^{2n}(1-r^{2n-1})}{r^{2n+1}(1-r^{2n})},$$

which has the limit $(1/r) > 1$ as $n\to\infty$. Since the series then diverges, we
conclude that $\phi(u)$ is continuous at $u=0$.


(d) In order to locate the discontinuities of $\phi(u)$ we have to locate the poles of
$f(x)$. From (4.1) we have the formal expansion

$$F(x) = 1+\sum_{n=1}^{\infty}\frac{r^{n^2}x^n}{(1+r^2x)(1+r^4x)\cdots(1+r^{2n}x)}. \tag{4.3}$$

The series on the right converges for all $x \ne -1/r^{2n}$, $(n=1, 2, 3, \cdots)$, and is
uniformly convergent in any bounded region from which the interiors of small circles about
these points have been removed. Let $S_n$ denote the sum of the first $n$ terms of this
series. Then, from (4.1), we have

$$F(x) = S_n+\frac{r^{n^2}x^n}{(1+r^2x)(1+r^4x)\cdots(1+r^{2n}x)}\,F(r^{2n}x),
\qquad n=1, 2, 3, \cdots.$$

Now by Theorem 2.3 $|F(x)|$ is bounded for $|x| < 1$. Consequently
$\lim_{n\to\infty}S_n=F(x)$ for $|x| < 1$. It follows that (4.3) is a valid expansion of the
function $F(x)$.

Since $F(x)=1+rxf(x)$, we have

$$f(x) = \sum_{n=1}^{\infty}\frac{r^{n^2-1}x^{n-1}}{(1+r^2x)(1+r^4x)\cdots(1+r^{2n}x)}$$

for all $x \ne -1/r^2, -1/r^4, \cdots$, and the latter points are the poles of $f(x)$.
(e) From (d) it follows that $f(x)$ must have an expansion of the form

$$f(x) = \sum_{n=1}^{\infty}\frac{M_n}{1+r^{2n}x},$$

where $M_n > 0$, $(n=1, 2, 3, \cdots)$. Thus, the function $\phi(u)$ will be completely
determined at all points of continuity (except for an additive constant) if we know the
values of the numbers $M_n$. By (4.1) we have $(1+r^2x)f(x)=1+r^3xf(r^2x)$ and therefore

$$M_1 = \prod_{n=1}^{\infty}(1-r^{2n-1}).$$

Then, by this same relation,

$$M_n = \frac{r^{n-1}}{(1-r^2)(1-r^4)\cdots(1-r^{2n-2})}\,M_1,\qquad n=2, 3, 4, \cdots. \tag{4.4}$$
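The masses can be cross-checked against the moments numerically. The sketch below is an
illustration under stated assumptions: it takes $M_1=\prod_{n\ge1}(1-r^{2n-1})$ and the
recursion $M_n = rM_{n-1}/(1-r^{2n-2})$ (what comparing the poles at $x=-1/r^{2n}$ on both
sides of $(1+r^2x)f(x)=1+r^3xf(r^2x)$ gives), places the mass $M_n$ at $u=r^{2n}$, and
compares $\sum_n M_n r^{2nk}$ with the closed form
$c_k=r^{2k}(1-r)(1-r^3)\cdots(1-r^{2k-1})$. The function names and the truncation lengths
are choices made for this example.

```python
import math

def masses(r, count):
    """M_1 as a truncated infinite product, then M_n = r/(1 - r^(2n-2)) * M_(n-1)."""
    m1 = 1.0
    for n in range(1, 200):                 # product converges geometrically
        m1 *= 1.0 - r ** (2 * n - 1)
    out = [m1]
    for n in range(2, count + 1):
        out.append(out[-1] * r / (1.0 - r ** (2 * n - 2)))
    return out

def moment_from_masses(r, k, M):
    """k-th moment of the step function with jump M_n at the point u = r^(2n)."""
    return sum(m * r ** (2 * n * k) for n, m in enumerate(M, start=1))

def moment_closed_form(r, k):
    """r^(2k) (1 - r)(1 - r^3)...(1 - r^(2k-1)); the empty product is 1 for k = 0."""
    return r ** (2 * k) * math.prod(1.0 - r ** (2 * j - 1) for j in range(1, k + 1))

r = 0.5
M = masses(r, 80)
total_mass = sum(M)
errors = [abs(moment_from_masses(r, k, M) - moment_closed_form(r, k)) for k in range(6)]
```

The total mass comes out as 1, in agreement with $c_0=1$, and the largest jump $M_1$ sits
at $u=r^2$.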


This determines $\phi(u)$ in accordance with the following table of values.

Value of $\phi(u)$                                                          Value of $u$
$1$                                                                         $r^2 \le u \le 1$
$1-M_1$                                                                     $r^4 \le u < r^2$
$1-\left(1+\frac{r}{1-r^2}\right)M_1$                                       $r^6 \le u < r^4$
$1-\left(1+\frac{r}{1-r^2}+\frac{r^2}{(1-r^2)(1-r^4)}\right)M_1$            $r^8 \le u < r^6$
$\cdots$                                                                    $\cdots$
$0$                                                                         $u=0$

5. Hypergeometric summability. The continued fraction of Gauss [11, p. 348] generates an
interesting totally monotone sequence when the parameters are properly restricted. Thus,
if we have given the special hypergeometric series

$$F(\alpha, 1, \gamma, -x) = 1-\frac{\alpha}{\gamma}\,x
+\frac{\alpha(\alpha+1)}{\gamma(\gamma+1)}\,x^2-\cdots,$$

we can obtain the representation

$$F(\alpha, 1, \gamma, -x) = 1/1+g_1x/1+(1-g_1)g_2x/1+(1-g_2)g_3x/1+\cdots,$$

where

$$g_{2n} = \frac{n}{\gamma+2n-1},\qquad g_{2n-1} = \frac{\alpha+n-1}{\gamma+2n-2}.$$

The $g_n$'s will be real and will lie between 0 and 1 if and only if $\alpha$ and $\gamma$
are real, $\gamma > \alpha > 0$. The moments are then

$$c_0=1,\qquad c_n = \frac{\alpha(\alpha+1)(\alpha+2)\cdots(\alpha+n-1)}
{\gamma(\gamma+1)(\gamma+2)\cdots(\gamma+n-1)},\qquad n=1, 2, 3, \cdots. \tag{5.1}$$

Since the hypergeometric series converges for $x=1$, $\gamma > \alpha$, it follows that
$\lim c_n=0$, and hence that the mass function $\phi(u)$ is continuous at $u=1$. Moreover,
it is easy to show that the series (2.3) diverges, and hence that $\phi(u)$ is continuous
at $u=0$. Accordingly, the moment sequence generated by this continued fraction is regular
when $\alpha$, $\gamma$ are real and $\gamma > \alpha > 0$.

We can determine $\phi(u)$ from the familiar Eulerian integral of the first kind. In fact,
we have

$$F(\alpha, 1, \gamma, -x) = \int_0^1\frac{d\phi(u)}{1+xu},$$

where

$$\phi(u) = \frac{\Gamma(\gamma)}{\Gamma(\alpha)\Gamma(\gamma-\alpha)}
\int_0^u t^{\alpha-1}(1-t)^{\gamma-\alpha-1}\,dt.$$
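The link between the parameters of the Gauss continued fraction and the moments
$c_n=\alpha(\alpha+1)\cdots(\alpha+n-1)/\gamma(\gamma+1)\cdots(\gamma+n-1)$ can be verified
on truncated series. This is a sketch only: it assumes the values
$g_{2n-1}=(\alpha+n-1)/(\gamma+2n-2)$, $g_{2n}=n/(\gamma+2n-1)$, uses the sample parameters
$\alpha=1/2$, $\gamma=5/3$, and the helper names `cf_series` and `gauss_g` are hypothetical.

```python
from fractions import Fraction

def cf_series(c0, numerators, order):
    """Coefficients of x^0..x^order of c0/(1 + a1*x/(1 + a2*x/(1 + ...)))."""
    def inverse(p):                       # reciprocal of a truncated series, p[0] != 0
        q = [Fraction(1) / p[0]]
        for k in range(1, order + 1):
            q.append(-sum(p[j] * q[k - j] for j in range(1, k + 1)) / p[0])
        return q
    tail = [Fraction(1)] + [Fraction(0)] * order
    for a in reversed(numerators):        # work outward: tail <- 1 + a*x/tail
        inv = inverse(tail)
        tail = [Fraction(1)] + [a * inv[k - 1] for k in range(1, order + 1)]
    return [c0 * t for t in inverse(tail)]

def gauss_g(alpha, gamma, count):
    """First `count` parameters g_1, g_2, ... in the assumed closed form."""
    g = []
    for n in range(1, count + 1):
        g.append((alpha + n - 1) / (gamma + 2 * n - 2))   # g_(2n-1)
        g.append(Fraction(n) / (gamma + 2 * n - 1))       # g_(2n)
    return g[:count]

alpha, gamma, N = Fraction(1, 2), Fraction(5, 3), 8       # sample values, gamma > alpha > 0
g = gauss_g(alpha, gamma, N)
numerators = [g[0]] + [(1 - g[k - 1]) * g[k] for k in range(1, N)]
moments = [(-1) ** n * t for n, t in enumerate(cf_series(Fraction(1), numerators, N))]

expected = [Fraction(1)]
for n in range(N):                                        # c_(n+1) = c_n (alpha+n)/(gamma+n)
    expected.append(expected[-1] * (alpha + n) / (gamma + n))
```

All the $g_n$ produced for these sample parameters lie strictly between 0 and 1, as the
condition $\gamma > \alpha > 0$ requires.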


The sequence (5.1) is a special case of the sequence of coefficients in the general
hypergeometric series:

$$c_0=1,\qquad c_n = \frac{\alpha(\alpha+1)(\alpha+2)\cdots(\alpha+n-1)\,
\beta(\beta+1)(\beta+2)\cdots(\beta+n-1)}{1\cdot2\cdots n\;\gamma(\gamma+1)\cdots(\gamma+n-1)},
\qquad n=1, 2, 3, \cdots. \tag{5.2}$$

Accordingly, it is convenient to designate the method of summability defined by (5.2), when
the sequence is regular, as hypergeometric summability. In this connection we use the symbol
(H, α, β, γ), where in particular the sequence (5.1) defines summability (H, α, 1, γ).

By means of (1.4) the general term of the Hausdorff matrix associated with summability
(H, α, 1, γ) is readily found to be

$$C_{m,n}\,\Delta^{m-n}c_n = \frac{C_{n+\alpha-1,\,n}\;C_{m-n+\gamma-\alpha-1,\,m-n}}
{C_{m+\gamma-1,\,m}},\qquad \gamma > \alpha > 0. \tag{5.3}$$


Next we display some of the inclusion and equivalence relationships between the
hypergeometric methods. Some of the symbolism is not readily intelligible and will be
explained presently.

(5.4)
(i) (H, α, 1, γ) = (H, 1, α, γ), γ > α;
(ii) (H, α, γ, γ) = (H, α, 1, 1), 0 < α < 1; γ > 0;
(iii) (H, 1, 1, γ+1) = (C, γ), γ > 0;
(iv) (H, α, 1, γ+1) ⊂ (C, γ), α > 1; γ > 0; (H, 1, 1, α);
(v) (H, α, 1, γ+1) ⊃ (C, γ), 0 < α < 1; γ > 0; (H, α, 1, 1);
(vi) (H, α, 1, α+1) ⊂ (C, α), α > 1;
(vii) (H, α, 1, α+1) ⊃ (C, α), 0 < α < 1;
(viii) (H, α, 1, γ_1) ⊂ (H, α, 1, γ_2), γ_2 > γ_1 > α > 0; (H, γ_1, 1, γ_2);
(ix) (H, α_2, 1, γ) ⊂ (H, α_1, 1, γ), α_2 > α_1 > 0; γ > 0; (H, α_1, 1, α_2);
(x) (C, α_1) ⊂ (C, α_2), α_2 > α_1 > -1; (H, α_1+1, 1, α_2+1);
(xi) (H, α, 1, α+1) ∼ (H, β, 1, β+1); α, β > 0;
(xii) (C, 1) ∼ (H, α, 1, α+1), α > 0;
(xiii) (H, α_1, 1, α_1+β) ∼ (H, α_2, 1, α_2+β); α_1, α_2, β > 0; |α_1 - α_2| = 1, 2, 3, ···;
(xiv) (H, α, k_1+1, γ) ⊃ (H, α, k_2+1, γ); k_1, k_2 = 0, 1, 2, ···; k_2 > k_1;
      (H, k_1+1, 1, k_2+1);
(xv) (H, α, 1, γ) ⊃ (H, α, k+1, γ), k = 1, 2, 3, ···; (H, 1, 1, k+1).


It should be mentioned that the matrix defined by (5.3) can be found in the literature on
summability, but not in the same form. It was used by Cesàro in his celebrated theorem on
the Cauchy product of two Cesàro summable series [12, p. 489], by Knopp in his proof of the
equivalence of the (C, k) and (H, k) methods for positive integral values of k
[12, p. 481], and by Hausdorff [3] in his proof of the same equivalence theorem. However,
it has never before been associated with the sequence of coefficients in the hypergeometric
series, and most of the relationships in (5.4) are new.

The identities (i) and (ii), and other related ones, obtain in virtue of the symmetric form
of the sequence (5.2). The identity (iii) merely exhibits Cesàro summability as a special
case of hypergeometric summability.

The remaining relationships are proved with the aid of the following theorems of
Hausdorff [4].


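THEOREM 5.1. Necessary and sufficient conditions that a matrix

$$\mathfrak{A} = D(\delta_{mn}c_m)D$$

be regular are

(i) $c_0 = 1$,

(ii) $\sum_{n=0}^{m} C_{m,n}\,|\Delta^{m-n}c_n| \le M$, $m = 0, 1, 2, \cdots$, $M$ independent
of $m$,

(iii) $\lim_{m\to\infty} C_{m,n}\,\Delta^{m-n}c_n = 0$, $n = 0, 1, 2, \cdots$.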
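The three conditions are easy to inspect on a concrete matrix. The following sketch is an
illustration, assuming the general term $C_{m,n}\Delta^{m-n}c_n$ with
$\Delta c_n=c_n-c_{n+1}$ and the sample sequence $c_n=1/(n+1)$; the function name
`hausdorff_matrix` is hypothetical. For this sequence every entry of row $m$ equals
$1/(m+1)$, so the transform is the arithmetic mean (C, 1), and it is applied here to the
partial sums of the series $1-1+1-1+\cdots$.

```python
from fractions import Fraction
from math import comb

def hausdorff_matrix(c, size):
    """Lower-triangular matrix with entries C(m, n) * Delta^(m-n) c_n."""
    diffs = [list(c[:size])]                       # diffs[k][n] = Delta^k c_n
    for _ in range(1, size):
        prev = diffs[-1]
        diffs.append([prev[n] - prev[n + 1] for n in range(len(prev) - 1)])
    return [[comb(m, n) * diffs[m - n][n] for n in range(m + 1)] for m in range(size)]

SIZE = 10
c = [Fraction(1, n + 1) for n in range(SIZE)]      # assumed example sequence
A = hausdorff_matrix(c, SIZE)

row_sums = [sum(row) for row in A]                 # equals c_0 in every row
abs_row_sums = [sum(abs(x) for x in row) for row in A]

partial_sums = [Fraction((n + 1) % 2) for n in range(SIZE)]   # 1, 0, 1, 0, ...
transform = [sum(a * s for a, s in zip(row, partial_sums)) for row in A]
```

Here the row sums are identically $c_0=1$, the absolute row sums are bounded by $M=1$, and
each column tends to zero like $1/(m+1)$.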


THEOREM 5.2. Let $\mathfrak{A}=D(\delta_{mn}c_m^{A})D$ and
$\mathfrak{B}=D(\delta_{mn}c_m^{B})D$ be regular matrices, and let $\mathfrak{B}^{-1}$
exist. Then, a necessary and sufficient condition that $\mathfrak{A}\supset\mathfrak{B}$ is
that the matrix $D(\delta_{mn}c_m^{A}/c_m^{B})D$ be regular.


THEOREM 5.3. Let $\mathfrak{A}$ and $\mathfrak{B}$ be regular matrices, and let
$\mathfrak{A}^{-1}$ and $\mathfrak{B}^{-1}$ exist. Then, necessary and sufficient conditions
that $\mathfrak{A}\sim\mathfrak{B}$ are that the matrices

$$D(\delta_{mn}c_m^{A}/c_m^{B})D,\qquad D(\delta_{mn}c_m^{B}/c_m^{A})D$$

be regular.


The relation (iv) is established with the aid of Theorem 5.2. Let $\{c_n'\}$ and
$\{c_n''\}$ be the moment sequences associated respectively with the (C, γ) and
(H, α, 1, γ+1) methods of summation. Then

$$\frac{c_n'}{c_n''} = \frac{1\cdot2\cdots n}{\alpha(\alpha+1)\cdots(\alpha+n-1)}.$$

However, this defines the regular moment sequence associated with summability (H, 1, 1, α),
(α > 1). This completes the proof, and explains the appendage to statement (iv). The
relations (v), (viii), (ix), and (x) are established with the same technique.

The statement (v) is of particular interest due to the scarcity of methods of summation of
the type (1.1) which include (C, γ) summability. Indeed, we know of only two commonly known
methods of the type (1.1) which have this property, the method of de la Vallée Poussin(4)
and a method of M. Riesz(5). Hille and Tamarkin [15] give an interesting set of necessary
and sufficient conditions in order that a Hausdorff method shall include (C, γ). They are
stated in detail for the case when γ is an integer, and the conditions are easily handled.

It is convenient at this time to define Riesz means of the first order and Nörlund means.
We write

$$\frac{p_0s_0+p_1s_1+\cdots+p_ns_n}{P_n}, \tag{5.5}$$

$$\frac{p_ns_0+p_{n-1}s_1+\cdots+p_0s_n}{P_n}, \tag{5.6}$$

where $P_n=p_0+p_1+\cdots+p_n$, and $\sum p_n$ in (5.5) is always divergent.

Since means of the type (5.5) were used in the early development of Riesz typical means,
they are called Riesz means and are designated by (R, p_n). Means of the type (5.6) are
called Nörlund means and are designated by (N, p_n).

(4) That the method of de la Vallée Poussin includes (C, γ) summability was proved
independently and virtually simultaneously by T. H. Gronwall [13, p. 1664], and C. N. Moore
[14, p. 1774].

(5) It has been proved that the Riesz logarithmic mean of order γ provides a method of
summation which includes (C, γ) summability [16].

The statements (vi) and (vii) are of course special cases of (iv) and (v) respectively. It
is of interest that the transform (1.1) associated with (H, α, 1, α+1), a method of the
Riesz type, contains the coefficients in the transform associated with (C, α), a method of
the Nörlund type, written in the reverse order.

The statement (x) is a classical result in the domain of Cesàro summability. The proof is
particularly easy to understand with the aid of the hypergeometric notation.

In order to prove (xi), (xii), and (xiii) we need the lemma which follows.


LEMMA 5.1. Let the sequences $\{c_n^{(i)}\}$, $(i=1, 2, \cdots, k)$, be regular sequences.
Then, if $\sum_{i=1}^{k}d_i=1$, the sequence $\{c_n\}$, where

$$c_n = \sum_{i=1}^{k}d_i\,c_n^{(i)},\qquad n = 0, 1, 2, \cdots,$$

is also regular.


The condition imposed on the $d_i$'s insures that condition (i) of Theorem 5.1 be
fulfilled. The other conditions of the theorem will be fulfilled since the difference
operation is linear.

To start the proof of (xi), let $\{c_n^{\alpha}\}$ and $\{c_n^{\beta}\}$ be the moment
sequences associated respectively with the methods (H, α, 1, α+1) and (H, β, 1, β+1). Then

$$\frac{c_n^{\alpha}}{c_n^{\beta}} = \frac{\alpha(\beta+n)}{\beta(\alpha+n)},
\qquad \alpha, \beta > 0.$$

Let $\{c_n'\}=(1, 1, 1, \cdots)$. Then

$$\frac{c_n^{\alpha}}{c_n^{\beta}} = d_1c_n'+d_2c_n^{\alpha},$$

where $d_1=\alpha/\beta$, $d_2=(\beta-\alpha)/\beta$, $d_1+d_2=1$. Then, by Lemma 5.1, the
sequence $\{c_n^{\alpha}/c_n^{\beta}\}$ is regular. Using Theorem 5.2 we now have
(H, α, 1, α+1) ⊃ (H, β, 1, β+1). By repetition of this argument we can also show that the
sequence $\{c_n^{\beta}/c_n^{\alpha}\}$ is regular. Thus, we have
(H, α, 1, α+1) ∼ (H, β, 1, β+1). This proof is due to Hausdorff [3].

The relation (xii) is an interesting special case of (xi). We observe that summability
(H, α, 1, α+1), (α > 0), is essentially summability (R, $n^{\alpha-1}$), (α > 0), and that
the relationship (C, 1) ∼ (R, $n^{\alpha-1}$), (α > 0), can be proved in a completely
different fashion [17].

Knopp [12, p. 481] has proved (xiii) and (viii) by very laborious methods for integral
values of the parameters. These statements afford the entire basis for his Cesàro-Hölder
equivalence proof. To establish (xiii) we assume that $\alpha_2 > \alpha_1$, and first prove
that (H, α_1, 1, α_1+β) ∼ (H, α_1+1, 1, α_1+1+β). Let $\{c_n'\}$ and $\{c_n''\}$ be the
moment sequences associated respectively with these two methods. Then

$$\frac{c_n'}{c_n''} = \frac{\alpha_1}{\alpha_1+\beta}
+\frac{\beta}{\alpha_1+\beta}\cdot\frac{\alpha_1}{\alpha_1+n},$$

and is consequently a linear combination of regular sequences, where the constants of
combination add up to unity. Likewise, as in the proof of (xi), $\{c_n''/c_n'\}$ is a
regular sequence. We use Theorem 5.3 to complete the first stage of the proof. Next we
prove in the same fashion that (H, α_1+1, 1, α_1+1+β) ∼ (H, α_1+2, 1, α_1+2+β). Clearly, it
can readily be established by induction that (H, α_1, 1, α_1+β) ∼ (H, α_2, 1, α_2+β),
provided that $\alpha_1$ and $\alpha_2$ differ by an integer.

Up to the present time we have considered hypergeometric summability (H, α, β, γ), only
for the case β = 1, γ > α. It is possible to find non-trivial regular hypergeometric methods
at least for positive integral values of β by use of a known relationship [18, p. 233]
among the hypergeometric functions. If we write

$$F(\alpha, \beta, \gamma, x) = 1+\frac{\alpha\beta}{\gamma\cdot1}\,x
+\frac{\alpha(\alpha+1)\beta(\beta+1)}{\gamma(\gamma+1)\cdot1\cdot2}\,x^2+\cdots,$$

the series always converges and represents the function for $|x| < 1$; α, β, γ > 0. Now, we
consider the identity

$$(\beta-\alpha)F(\alpha, \beta, \gamma, x)+\alpha F(\alpha+1, \beta, \gamma, x)
-\beta F(\alpha, \beta+1, \gamma, x) = 0,\qquad |x| < 1. \tag{5.7}$$

Set β = 1 in (5.7) and write

$$F(\alpha, 2, \gamma, x) = (1-\alpha)F(\alpha, 1, \gamma, x)
+\alpha F(\alpha+1, 1, \gamma, x),\qquad |x| < 1.$$

Equating coefficients of $x^n$ in this identity we obtain the relation

$$\frac{(n+1)\,\alpha(\alpha+1)\cdots(\alpha+n-1)}{\gamma(\gamma+1)\cdots(\gamma+n-1)}
= (1-\alpha)\,\frac{\alpha(\alpha+1)\cdots(\alpha+n-1)}{\gamma(\gamma+1)\cdots(\gamma+n-1)}
+\alpha\,\frac{(\alpha+1)(\alpha+2)\cdots(\alpha+n)}{\gamma(\gamma+1)\cdots(\gamma+n-1)},
\qquad n = 0, 1, 2, \cdots. \tag{5.8}$$

Now, the expression on the right is a linear combination of regular sequences, provided
that γ > α+1; α > 0, where the constants of combination add up to unity. Consequently, the
left member of (5.8) defines a regular method of summability, (H, α, 2, γ), γ > α+1; α > 0.
If we now use (5.7) as a recursion relation, we can define summability (H, α, 3, γ) in
terms of summability (H, α, 2, γ). Continuing this process, we define the regular methods
(H, α, k+1, γ), (k = 0, 1, 2, ···); γ > α+k; α > 0, with the associated moment sequence

$$c_n = \frac{\alpha(\alpha+1)\cdots(\alpha+n-1)\,(k+1)(k+2)\cdots(k+n)}
{1\cdot2\cdots n\;\gamma(\gamma+1)\cdots(\gamma+n-1)},\qquad n = 0, 1, 2, \cdots.$$

It is now easy to prove the statement (xiv) using methods already established. Statement
(xv), a special case of (xiv), clearly indicates the weakness of these newly defined
methods.

6. Special methods in the difference matrix. In this section we propose to replace the
base sequence $\{c_n\}$ in the difference matrix $\Delta=(\Delta^mc_n)$ by known regular
sequences whose corresponding mass functions are continuous at $u=1$, and then discuss the
resulting methods of summation.

Rows in the difference matrix. If in the matrix $\Delta$ the regular sequence associated
with summability (H, α, 1, γ) is taken as the base sequence, then the (k+1)-st row defines
summability (H, α, 1, γ+k). It is understood of course, that we normalize each row sequence
by dividing each member of the sequence by its first term. Using (5.4, viii) we see that
the efficiency of the new methods increases with the depth of the row in the matrix.

If we start with the Euler-Knopp sequence, $\{\theta^n\}$, $0 < \theta < 1$, which defines
summability (E, θ), as the base sequence, no change occurs as a result of repeated
differencing.

E. Hille has given us an interesting example to prove that repeated differencing of the
base sequence does not always improve or leave unchanged the efficiency of a Hausdorff
method corresponding to a monotone non-decreasing mass function $\phi(u)$. In this
connection he utilizes the integral

$$c(z) = \int_0^1 u^z\,d\phi(u),\qquad \Re(z) \ge 0,$$

which is called the moment function of the associated Hausdorff method. The function
$c(z)$ is holomorphic when $\Re(z) > 0$, and it is continuous in $\Re(z) \ge 0$ [15]. To
show that a Hausdorff method $[H, \phi_1(u)]$ includes $[H, \phi_2(u)]$ we must establish
the existence of a moment function $c(z)$ such that

$$c_1(z) = c(z)\,c_2(z).$$

Now, let $\phi(u)$ be a step function with two discontinuities so that

$$c_n = \alpha a^n+\beta b^n,\qquad \alpha+\beta=1,\ 0 < \beta < \alpha;\quad 0 < a < b < 1.$$

Thus, $\phi(u)$ is a monotone non-decreasing function. The moment function
$c(z)=\alpha a^z+\beta b^z$ has a set of equidistant zeros on a vertical line in the right
half-plane. The moment function corresponding to the normalized sequence of the $m$th row of
the difference matrix is $c_m(z)=\Delta^mc(z)/\Delta^mc(0)$. Now, if the Hausdorff method of
summation defined by the sequence $\{c_n^{(m)}\}$ is to include or be equivalent to the
method defined by $\{c_n\}$, then the quotient $c_m(z)/c(z)$ has to be a moment function. In
particular, it must be holomorphic for $\Re(z) > 0$. However, $c(z)$ vanishes for

$$z = \left[\log\frac{\beta}{\alpha}+(2k+1)\pi i\right]\left(\log\frac{a}{b}\right)^{-1},$$

and $c_m(z)$ vanishes for

$$z = \left[\log\frac{\beta}{\alpha}+m\log\frac{1-b}{1-a}+(2k+1)\pi i\right]
\left(\log\frac{a}{b}\right)^{-1}.$$

Since $(1-a)/(1-b) > 1$, the zeros of $c(z)$ and $c_m(z)$ are completely distinct, and
their quotient is not holomorphic in the right half-plane. This establishes the case in
point.
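The location of the zeros in Hille's example can be examined numerically. The sketch below
is illustrative: the masses 0.7 and 0.3 and the jump points 0.2 and 0.5 are sample values
chosen here, the function names are hypothetical, and the zeros are taken from the relation
$(a/b)^z=-(\beta/\alpha)\,[(1-b)/(1-a)]^m$, which is what setting
$\alpha(1-a)^ma^z+\beta(1-b)^mb^z=0$ gives.

```python
import math

alpha, beta = 0.7, 0.3        # sample masses: alpha + beta = 1, beta < alpha
a, b = 0.2, 0.5               # sample jump points: 0 < a < b < 1

def c_m(z, m):
    """Normalised moment function of the m-th row: Delta^m c(z) / Delta^m c(0)."""
    num = alpha * (1 - a) ** m * a ** z + beta * (1 - b) ** m * b ** z
    den = alpha * (1 - a) ** m + beta * (1 - b) ** m
    return num / den

def zero_of_c_m(m, k):
    """k-th zero of c_m, solving (a/b)^z = -(beta/alpha) * ((1-b)/(1-a))^m."""
    log_rhs = (math.log(beta / alpha) + m * math.log((1 - b) / (1 - a))
               + (2 * k + 1) * math.pi * 1j)
    return log_rhs / math.log(a / b)

residuals = [abs(c_m(zero_of_c_m(m, k), m)) for m in range(4) for k in (-1, 0, 1)]
real_parts = [zero_of_c_m(m, 0).real for m in range(4)]   # abscissa of each line of zeros
value_of_c1_at_zero_of_c = abs(c_m(zero_of_c_m(0, 0), 1))
```

For these sample values the lines of zeros move steadily to the right as $m$ increases, so
a zero of $c(z)$ in the right half-plane is never cancelled by a zero of $c_m(z)$.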

Hille’s example raises the problem of determining conditions on the mass 
function ¢(u) in order that repeated differencing of the base sequence will 
yield new methods of unchanging or steadily increasing efficiency. 

Columns in the difference matrix. If we use the regular moment sequence associated with
summability (C, α) as the base sequence in $\Delta$, the (k+1)-st column defines
summability (H, α, 1, α+k+1). These methods increase in efficiency with increasing k.
Moreover, from (5.4, xiii), we see that if we vary α by any integral amount ±p,
(p = 1, 2, 3, ···), such that α±p > 0, the efficiency of the new methods, corresponding to
a particular k, remains unchanged. If we start with the sequence associated with
(H, α, 1, γ) as the base sequence, we get summability (H, γ-α, 1, γ+k) in the (k+1)-st
column. Again, the efficiency of the methods increases as we traverse the matrix to the
right by columns.

Starting with the Euler sequence as a base sequence we get expected results. We obtain the
sequence defined by $(1-\theta)^n$, $0 < \theta < 1$, in every column. Thus there is no
increase in efficiency with an increasing column index.

Diagonal files in the difference matrix. It has already been established that the diagonal
files of $\Delta$, as well as the rows and columns, yield regular moment sequences provided
the mass function of the base sequence satisfies appropriate continuity requirements.
However, starting with familiar base sequences, the diagonal files yield regular moment
sequences of a new type. It is of interest to recall that the mass function associated with
these new sequences is constant for $\frac14 \le u \le 1$. As a result of the discussion in
the next section it will be established that all of the diagonal files define methods of
summation which include (E, 1/4).

If we start with the sequence associated with summability (C, α) as the base sequence, the
principal diagonal then yields the regular moment sequence

$$\frac{\alpha}{(n+1)\,C_{2n+\alpha,\,n+1}},\qquad n = 0, 1, 2, \cdots;\ \alpha > 0. \tag{6.1}$$

If we start with summability (C, β) we obtain in the νth upper diagonal the regular moment
sequence

$$\frac{\beta\,C_{\nu+\beta,\,\nu}}{(n+\nu+1)\,C_{2n+\nu+\beta,\,n+\nu+1}},
\qquad n = 0, 1, 2, \cdots;\ \beta > 0. \tag{6.2}$$

If we start with summability (C, γ) the μth lower diagonal gives the sequence

$$\frac{\mu+\gamma}{(n+1)\,C_{2n+\mu+\gamma,\,n+1}},
\qquad n = 0, 1, 2, \cdots;\ \gamma > 0. \tag{6.3}$$

Observe that μ and ν vary through positive integral values only. If α, β, γ also vary
through only positive integral values, all of the methods of summation defined by the
sequences (6.1), (6.2), and (6.3) are equivalent. Indeed, these methods still remain
equivalent to each other if α, β, γ vary in any fashion so as to differ from each other
only by integral amounts. These statements may be proved by using the same technique as
employed in proving (5.4, xiii). To illustrate the procedure used, let $\{c_n'\}$ and
$\{c_n''\}$ be respectively the sequence (6.3) for fixed μ and γ, and the sequence obtained
from (6.3) by replacing γ by γ+1. We propose to show that the methods of summation defined
by these sequences are equivalent. We form the quotient

$$\frac{c_n'}{c_n''} = \frac{(\mu+\gamma)(2n+\mu+\gamma+1)}{(\mu+\gamma+1)(n+\mu+\gamma)}.$$

In the light of past experience, we see that both $\{c_n'/c_n''\}$ and $\{c_n''/c_n'\}$ are
regular sequences. Thus, the methods of summation in question are equivalent.

Starting with the Knopp-Euler sequence as a base sequence, we obtain the regular sequence
$\{(1-\theta)^n\theta^n\}$, $0 < \theta < 1$, in all the diagonals. This is clearly another
Knopp-Euler sequence.

Symmetry in the difference matrix. It is of some interest to find regular sequences which
give rise to symmetry about the principal diagonal of the difference matrix. For the
hypergeometric method (H, α, 1, γ) we have


= 
Thus, the method (H, 1, 1, γ) = (C, γ-1), (γ > 1), and the method (H, 1/2, 1, 1) have this
property. Moreover, the Knopp-Euler method (E, 1/2) also has this property.


We first noticed the symmetry of the method (H, 1/2, 1, 1) while considering the periodic
continued fraction

$$c_0-c_1x+c_2x^2-\cdots \sim \frac{1}{1} {}_{+} \frac{rx}{1} {}_{+}
\frac{r(1-r)x}{1} {}_{+} \frac{r(1-r)x}{1} {}_{+} \cdots.$$

The function represented by this continued fraction is

$$c_0-c_1x+c_2x^2-\cdots = \frac{(1-2r)-[1+4r(1-r)x]^{1/2}}{-2r(1+x)}.$$

Then

$$\Delta c_0-\Delta c_1x+\Delta c_2x^2-\cdots = \frac{1-[1+4r(1-r)x]^{1/2}}{-2rx}.$$

Hence

$$\Delta c_n = \frac{(2n)!}{n!\,(n+1)!}\,r^n(1-r)^{n+1},\qquad n = 0, 1, 2, \cdots. \tag{6.4}$$

The moment sequence (6.4) for $r=\frac12$, when normalized, is the moment sequence for
summability (H, 1/2, 1, 2).

It is easy to prove by means of Theorem 3.2 that the difference matrix is symmetric about
the principal diagonal if and only if the function $f(x)$ which generates the base sequence
has a continued fraction of the form given in (1.5, iii) in which $g_{2n-1}=\frac12$,
$(n=1, 2, 3, \cdots)$.
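The case $r=\frac12$ can be followed through in exact arithmetic. This is a sketch with
hypothetical helper names, assuming the first differences
$\Delta c_n=\frac{(2n)!}{n!(n+1)!}r^n(1-r)^{n+1}$ and the convention
$\Delta c_n=c_n-c_{n+1}$. It rebuilds the base sequence from $c_0=1$, compares it with
$C_{2n,n}/4^n$, normalizes the first row, and tests the symmetry
$\Delta^mc_n=\Delta^nc_m$ of the resulting difference matrix.

```python
from fractions import Fraction
from math import comb, factorial

def first_differences(r, count):
    """Delta c_n = (2n)!/(n!(n+1)!) * r^n * (1-r)^(n+1) for n = 0..count-1."""
    return [Fraction(factorial(2 * n), factorial(n) * factorial(n + 1))
            * r ** n * (1 - r) ** (n + 1) for n in range(count)]

r, N = Fraction(1, 2), 10
dc = first_differences(r, N)

c = [Fraction(1)]
for d in dc:
    c.append(c[-1] - d)                       # c_(n+1) = c_n - Delta c_n

central = [Fraction(comb(2 * n, n), 4 ** n) for n in range(N + 1)]

normalized_row = [d / dc[0] for d in dc]      # first row divided by its first member
half_over_two = [Fraction(1)]
for n in range(N - 1):                        # (1/2)_n / (2)_n built recursively
    half_over_two.append(half_over_two[-1] * (Fraction(1, 2) + n) / (2 + n))

rows = [c]
for _ in range(5):
    prev = rows[-1]
    rows.append([prev[n] - prev[n + 1] for n in range(len(prev) - 1)])
symmetric = all(rows[m][n] == rows[n][m] for m in range(5) for n in range(5))
```

In this example the base sequence is $C_{2n,n}/4^n$, that is, $\frac12(\frac12+1)\cdots
(\frac12+n-1)/n!$, and the normalized first row is the same product divided by
$2\cdot3\cdots(n+1)$.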

7. Analytic continuation. This section is devoted to some remarks concerning the
effectiveness of the Hausdorff methods in the analytic continuation of a power series
$\sum a_nz^n$ outside of its conventional circle of convergence. The Hausdorff methods of
particular interest in this respect are those for which the mass function $\phi(u)$ is a
monotone non-decreasing function which is constant in the neighborhood of $u=1$.

The Hausdorff transform of the sequence $\{s_n\}$ is given by

$$\sigma_m = \sum_{n=0}^{m}C_{m,n}\,\Delta^{m-n}c_n\,s_n
= \int_0^1\sum_{n=0}^{m}C_{m,n}\,u^n(1-u)^{m-n}s_n\,d\phi(u). \tag{7.1}$$

If $\phi(u)$ is a monotone non-decreasing function satisfying the regularity conditions
(1.3), which is constant for $\delta \le u \le 1$, we shall designate (7.1) the
δ-transform of the sequence $\{s_n\}$.

Now, from a result of Hille and Tamarkin [15], a necessary and sufficient condition that
$[H, \phi(u)] \supset (E, \delta)$ is that $\phi(u)=1$, $\delta \le u \le 1$. Thus, the
δ-method has at least the same efficiency as the Euler-Knopp method in the problem of
analytic continuation. It will be of interest to recall the nature of this region of
convergence. Corresponding to each singularity ζ of the power series, draw the circle whose
equation is

$$\left|z+\frac{1-\delta}{\delta}\,\zeta\right| = \frac{|\zeta|}{\delta}.$$

These circles are tangent to the sides of the Borel polygon of summability at the points of
singularity. The figure thus constructed is called the curvilinear polygon $B_\delta$.

Knopp [19] has established $B_\delta$ as the region of convergence for the
method $(E, \delta)$ for the case that $\delta=2^{-p}$, $(p=1, 2, 3, \cdots)$. Using the methods
of Knopp, we can readily prove that the $\delta$-transform of $f(z)=\sum a_n z^n$ converges
to $f(z)$ at a point z inside of $B_\delta$, but we have been unable to establish diver-
gence outside $B_\delta$. However, we shall offer some evidence in support of the
following conjecture: a necessary and sufficient condition that a Hausdorff
method of summation, corresponding to a monotone non-decreasing $\phi(u)$
satisfying the regularity conditions (1.3), shall sum a power series outside of
its circle of convergence is that $\phi(u)$ be constant in the neighborhood of $u=1$.

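In support of our conjecture we shall first prove that if $\phi(u)$ is not con-
stant in the neighborhood of $u=1$ then the transform (7.1) of the geometric se-
ries $\sum z^n$ diverges for $z=-1-\epsilon$, $\epsilon>0$. We have for this case $s_n=(1-z^{n+1})/(1-z)$, so that

$\sigma_m = \int_0^1 \sum_{n=0}^{m} C_{m,n}\,u^n(1-u)^{m-n}\,\frac{1-z^{n+1}}{1-z}\,d\phi(u) = \frac{1}{1-z} - \frac{z}{1-z}\int_0^1 (uz+1-u)^m\,d\phi(u).$

If $z=-1-\epsilon$, then

$\sigma_m = \frac{1}{2+\epsilon} + \frac{1+\epsilon}{2+\epsilon}\int_0^1 (1-2u-\epsilon u)^m\,d\phi(u),$

$|\sigma_m| \ge \frac{1+\epsilon}{2+\epsilon}\left|\int_0^1 (1-2u-\epsilon u)^m\,d\phi(u)\right| - \frac{1}{2+\epsilon}.$

Let m be even. Then, for $\epsilon_1>0$, $\epsilon_1<\epsilon$, $\eta=(2+\epsilon_1)/(2+\epsilon)$, we have

$\int_0^1 (1-2u-\epsilon u)^m\,d\phi(u) \ge \int_\eta^1 (1-2u-\epsilon u)^m\,d\phi(u) \ge (1+\epsilon_1)^m\,[\phi(1)-\phi(\eta)].$

Hence, $|\sigma_m| \to \infty$ as $m \to \infty$.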

Next, we shall sum the series $\sum z^n$ with a special $\delta$-method. Incidentally,
the regular sequences (6.1), (6.2), and (6.3) define $\delta$-methods. Other ex-
amples are readily constructed. As a case in point, let $\phi(u)=u/\delta$, $0 \le u \le \delta$,
$0<\delta<1$; $\phi(u)=1$, $\delta \le u \le 1$. Let us test the corresponding $\delta$-transform on the
series $\sum z^n$. We have

$c_n = \frac{1}{\delta}\int_0^\delta u^n\,du = \frac{\delta^n}{n+1}$

and

$\Delta^{m-n}c_n = \frac{1}{\delta}\int_0^\delta u^n(1-u)^{m-n}\,du.$

Then, the $\delta$-transform of the series $\sum z^n$ is

$\sigma_m = \sum_{n=0}^{m} C_{m,n}\,s_n\,\frac{1}{\delta}\int_0^\delta u^n(1-u)^{m-n}\,du = \frac{1}{1-z} + \frac{z(\delta z+1-\delta)^{m+1}}{\delta(1-z)^2(m+1)} - \frac{z}{\delta(1-z)^2(m+1)}.$

Clearly the transform converges to $1/(1-z)$ as $m \to \infty$ whenever

(7.2) $\left| z + (1-\delta)/\delta \right| < 1/\delta,$

and diverges to $\infty$ whenever $|z+(1-\delta)/\delta| > 1/\delta$. Evidently, the region $B_\delta$
is the largest region of convergence for the special case under consideration.
Finally, the fact that the inequality (7.2) becomes $|z|<1$ as $\delta \to 1$ is consistent
with our conjecture.
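
The special case above lends itself to a direct numerical check. The following is only an illustrative sketch, not part of the original argument: it assumes the moments $c_n=\delta^n/(n+1)$ of the mass function $\phi(u)=u/\delta$, builds the Hausdorff weights from finite differences, and compares the resulting transform of the geometric series with the closed expression derived above. The function names (`hausdorff_row`, `delta_transform`, `closed_form`) and the sample values of $\delta$ and z are hypothetical choices made for the illustration.

```python
from fractions import Fraction
from math import comb

def hausdorff_row(m, delta):
    """Weights C(m,n) * Delta^{m-n} c_n for the moments c_n = delta**n / (n+1)."""
    row = []
    for n in range(m + 1):
        # Delta^k c_n = sum_j (-1)^j C(k,j) c_{n+j}, with k = m - n
        diff = sum((-1) ** j * comb(m - n, j) * delta ** (n + j) / (n + j + 1)
                   for j in range(m - n + 1))
        row.append(comb(m, n) * diff)
    return row

def delta_transform(m, z, delta):
    """sigma_m for the geometric series, computed directly from its partial sums."""
    partial, s = [], Fraction(0)
    for n in range(m + 1):
        s += z ** n
        partial.append(s)
    return sum(w * s_n for w, s_n in zip(hausdorff_row(m, delta), partial))

def closed_form(m, z, delta):
    """Closed expression for sigma_m in the special case phi(u) = u/delta."""
    tail = z * ((delta * z + 1 - delta) ** (m + 1) - 1)
    return 1 / (1 - z) + tail / (delta * (1 - z) ** 2 * (m + 1))

delta = Fraction(1, 2)            # assumed sample value of delta
inside = Fraction(-2)             # |z + 1| = 1 < 2: inside the circle (7.2)
outside = Fraction(-4)            # |z + 1| = 3 > 2: outside the circle

# Direct summation and the closed form agree exactly in rational arithmetic.
print(all(delta_transform(m, inside, delta) == closed_form(m, inside, delta)
          for m in range(12)))
# Inside the circle the transform approaches 1/(1 - z) = 1/3 (slowly, like 1/m).
print(float(closed_form(400, inside, delta)), 1 / 3)
# Outside the circle the transform grows without bound.
print(float(abs(closed_form(20, outside, delta))),
      float(abs(closed_form(40, outside, delta))))
```

Exact rational arithmetic is used so that the agreement between the two computations is not blurred by rounding; the point z = -2 lies outside the unit circle but inside the circle (7.2) for $\delta=1/2$.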


REFERENCES 


1. L. L. Silverman, On the Definition of the Sum of a Divergent Series, University of Mis- 
souri Studies, Mathematics Series, vol. 1, no. 1, 1913. 

2. O. Toeplitz, Über allgemeine lineare Mittelbildungen, Prace Matematyczno-fizyczne,
vol. 22 (1911), pp. 113-119.

3. F. Hausdorff, Summationsmethoden und Momentfolgen, I and II, Mathematische Zeit- 
schrift, vol. 9 (1921), pp. 74-109, 280-299. 

4. H.L. Garabedian, Hausdorff matrices, American Mathematical Monthly, vol. 46 (1939), 
pp. 390-410. 

5. F. Hausdorff, Über das Momentenproblem für ein endliches Intervall, Mathematische
Zeitschrift, vol. 16 (1923), pp. 220-248.

6. H. S. Wall, Continued fractions and totally monotone sequences, these Transactions, vol. 
48 (1940), pp. 165-184. 

7. T.J. Stieltjes, Recherches sur les fractions continues, Annales de l'Université de Toulouse, 
vol. 8, J (1894), pp. 1-122; vol. 9, A (1895), pp. 1-47. 

8. T. J. Stieltjes, Oeuvres, vol. 2, Société Mathématique d’Amsterdam, Groningen,
P. Noordhoff, 1918.

9. W. T. Scott and H. S. Wall, A convergence theorem for continued fractions, these Transac-
tions, vol. 47 (1940), pp. 155-172.

10. I. J. Schoenberg, Über die asymptotische Verteilung reeller Zahlen mod 1, Mathematische
Zeitschrift, vol. 28 (1928), pp. 171-199.

11. O. Perron, Die Lehre von den Kettenbrüchen, Leipzig and Berlin, Teubner, 1913.




12. K. Knopp, Theorie und Anwendung der unendlichen Reihen, Berlin, Springer, 1922, 2d
edition, 1924.

13. T. H. Gronwall, Comptes Rendus de l’Académie des Sciences, Paris, vol. 158 (1914),
p. 1664.

14. C. N. Moore, ibid., vol. 158 (1914), p. 1774. 

15. E. Hille and J. D. Tamarkin, Questions of relative inclusion in the domain of Hausdorff 
means, Proceedings of the National Academy of Sciences, vol. 19 (1933), pp. 573-577. 

16. G. H. Hardy and M. Riesz, The General Theory of Dirichlet’s Series, Cambridge Mathe- 
matical Tracts, no. 18, 1915. 

17. H. L. Garabedian and W. C. Randels, Theorems on Riesz means, Duke Mathematical 
Journal, vol. 4 (1938), pp. 529-533. 

18. J. Pierpont, The Theory of Functions of a Complex Variable, vol. 2, New York, Ginn, 
1912. 

19. K. Knopp, Über das Eulersche Summierungsverfahren, I, Mathematische Zeitschrift,
vol. 15 (1922), pp. 226-253; II, ibid., vol. 18 (1923), pp. 125-156.


NORTHWESTERN UNIVERSITY, 
Evanston, ILL. 




POLYADIC GROUPS 


BY 
EMIL L. POST 


TABLE OF CONTENTS

SECTION

Introduction

I. GENERAL THEORY OF POLYADIC GROUPS

1. Definition of a polyadic group
2. Identity, inverse, equivalence
3. The coset theorem
4. Subgroups and transforms; expansion in cosets
5. Reducibility
6. Arbitrary containing ordinary groups
7. Determination of all types of semi-abelianisms
8. On the construction of polyadic groups

II. FINITE POLYADIC GROUPS

A. m-ADIC SUBSTITUTIONS AND SUBSTITUTION GROUPS

9. The symmetric m-adic substitution group of degree n
10. $2^{m-1}$-fold classification of m-adic substitutions; the m-adic alternating groups
11. Associated and containing ordinary groups; commutative m-adic substitutions
12. Further study of the complete m-adic $\delta$-group and m-adic alternating groups
13. Transitive m-adic substitution groups
14. Intransitive m-adic substitution groups
15. Substitutions which are commutative with each of the substitutions of a transitive m-adic substitution group
16. Holomorphs of a regular m-adic substitution group
17. m-adic groups of $\mu$-adic substitutions
18. Primitive and imprimitive (m, $\mu$) groups
19. Multiple transitivity; cyclically transitive m-adic substitution groups
20. Class of an m-adic substitution group

B. FINITE ABSTRACT POLYADIC GROUPS

21. Cyclic polyadic groups; ordinary theory
22. Cyclic polyadic groups; polyadic theory
23. Abstract polyadic groups of the first three orders
24. Properties of transforms
25. Generation of polyadic groups by two subgroups, one invariant under the elements of the other
26. m-adic groups of order g prime to m-1
27. Sylow subgroups of order $p^a$ with $g/p^a$ prime
28. Representation of an arbitrary m-adic group as a regular m-adic substitution group
29. Invariant subgroups and quotient groups; the m-adic central quotient group
30. Commutator, semi-commutator, and quasi-commutator subgroups
31. The $\phi$-subgroup of an m-adic group
32. Simply isomorphic m-adic groups; group of inner isomorphisms
33. Extension of Frobenius’s theorem to m-adic groups
34. Representation of an abstract m-adic group as a transitive (m, $\mu$) substitution group

C. FINITE m-ADIC LINEAR GROUPS

35. m-adic linear transformations
36. m-adic collineations and collineation-groups
37. m-adic Hermitian invariants
38. Reduction to canonical form
39. m-adic invariants
40. Generalization of m-adic substitution and transformation groups

Presented to the Society, October 26, 1935; received by the editors January 4, 1940.


INTRODUCTION 


The group concept is peculiar in the breadth of its application and the 
narrowness of its formulation. By modifying one or more of its restric- 
tions there have resulted such concepts as that of semi-group, groupoid, 
mischgruppe, quasi-group, hypergroup, multigroup. In all of these general- 
izations of the group concept the group operation remains dyadic, that is, 
it is a function of two independent variables. Our present interest is in that 
generalization of the group concept which results when, while retaining all 
other of its special features, the group operation becomes polyadic, that is, a 
function of any finite number of independent variables. 

As far back as 1904, E. Kasner thus considered generalizing the ordinary
“group property,” and called a set of elements closed under a k-adic operation
a k-adic system(1). But the complete formulation of this generalization seems
to have been first effected by Dörnte(2) in 1928 in a paper containing an ex-
tensive theory of what he there terms n-groups, n being the number of inde-
pendent variables in the operation. In 1932 Lehmer(3) independently
formulated and investigated the special concept he termed triplex, which,
in Dörnte’s terminology, is an abelian 3-group. Dörnte’s n-group, to change

(1) While the paper in question, An extension of the group concept, has not appeared in
print, an abstract thereof will be found in the Bulletin of the American Mathematical Society, 
vol. 10 (1904), pp. 290-291. Though at one point of the abstract Kasner observes that “the 
law of combination of the general system is best exhibited by means of its k dimensional 
multiplication table,” his original definition adds the requirement that the combination of no 
fewer than k elements shall be contained in the system—a requirement that is meaningless 
unless the k-adic operation itself is merely an extended product based on a prior dyadic opera- 
tion. And the absence of any mention of an associative law, coupled with a reference to the in- 
verse of an element, further suggests that, as in Miller’s perfect cosets referred to below, this 
dyadic operation is understood to be that of some actual group in the ordinary sense containing 
the given system. 

(2) W. Dörnte, Untersuchungen über einen verallgemeinerten Gruppenbegriff, Mathematische
Zeitschrift, vol. 29 (1928), pp. 1-19.

(3) D. H. Lehmer, A ternary analogue of abelian groups, American Journal of Mathematics,
vol. 54 (1932), pp. 329-338.




the symbol, is also our m-adic group, or, for unspecified m, our polyadic
group(4).

As examples of triadic systems, and these also are examples of triadic
groups, Kasner mentions “the odd permutations in any number of letters,
the $\infty^2$ central symmetries of the plane or the $\infty^3$ of space, the totality of
dual or reciprocal transformations, the correlations contained in any projec-
tive group, the totality of conformal transformations of the plane which re-
verse angles.” In the introduction of his paper Dörnte mentions, among other
examples, residue classes modulo k as (k+1)-groups, and in the body of his
paper introduces many such arithmetical illustrations as exemplifiers of his
abstract development. Apart from examples which are the subject of a major
part of our theory, we may add the linear transformations of determinant
an (m-1)-st root of unity as an m-group, and, more significantly, the m-group
consisting of all the substitutions of a group which, instead of carrying a fixed
letter into itself, transform, say, $a_1 \to a_2$, $a_2 \to a_3$, $\cdots$, $a_{m-1} \to a_1$. In all of these
examples the polyadic operation is merely an extended product expressed in
terms of a prior dyadic operation. On the other hand, lengths under the oper-
ation fourth proportional, now to be written b:a=c:x, constitute a 3-group
in which, geometrically, the triadic operation is primary(5). Even more so
for an abstract m-group whose operation is given ab initio by an m-dimen-
sional table.

While the abstract formulation of polyadic group must be credited to
Dörnte, in its coset theorem the present paper may be said to solve the prob-
lem of determining the essential nature of a polyadic group. This basic result
is to the effect that any m-adic group can have its class of elements so wid- 
ened, and in that widened class a dyadic operation so introduced, that the 
enlarged class, under that operation as product, constitutes an ordinary group 
in which the class of elements of the m-adic group is a coset of an invariant 
subgroup of the ordinary group, and the operation of the m-adic group the 
product of m elements as elements of the ordinary group(6). At first glance
this theorem seems to be identical, for finite groups, with a result of Miller’s 


(4) The present paper arose as a reaction to the importance ascribed to the group concept
by C. J. Keyser in his Mathematical Philosophy, New York, 1922, Lecture XII. But see the 
next to the last paragraph of this introduction. We may note that an early attempt on our 
part to thus generalize the group concept on the basis of its fourfold characterization had 
failed. But on now turning to the twofold basis as given by Miller (Finite Groups, below, p. 52) 
we found generalization to be immediate. 

(5) Analytically, the operation becomes x=(ac)/b, and so a variant of a-b+c, easily seen
to lead to a 3-group. This last is already present in Dörnte’s paper, and generalized in his
Theorem 7, §1. Note that geometrically even, the binary operation multiplication can never-
theless be defined, even if secondary. The about-to-be-mentioned coset theorem shows the same
situation to obtain in general.

(6) Cf. A. Suschkewitsch, Über die Erweiterung der Semigruppe bis zur ganzen Gruppe,
Communications de la Société Mathématique de Kharkoff, (4), vol. 12 (1935), pp. 81-87.




of 1935(7). But, apart from other differences in hypothesis, Miller obtains the 
coset conclusion by essentially assuming the given set of elements to be in 
an ordinary group(8). However, as a result of the two theorems, finite poly-
adic group does become identical with Miller’s “perfect co-set,” some of whose
properties he develops, provided the latter is understood to mean set of ele-
ments and polyadic operation thereon(9).

In addition to differences in abstract development, the present paper goes
beyond Dörnte’s in generalizing the concepts of substitution and linear trans-
formation in such a way that the resulting m-adic substitutions and m-adic
linear transformations naturally lead to m-adic groups thereof (see §9 and 
§35 for their definition). These m-adic groups we study as generalizations of 
ordinary substitution and linear transformation groups. As incentive for this 
development, we have the theorem that any abstract m-adic group (finite) 
can be represented as a “regular” m-adic substitution group, a theorem which, 
indeed, first gave us our coset theorem. In the final section of the paper these 
concepts receive a wide extension which remains significant for ordinary 
groups. But they are then seen to be at least closely related to a type of ordi-
nary group formulated by Specht(10).

Intermediate between these generalizations of substitution group is one 
which includes m-adic groups of ordinary substitutions. Two of our examples 
given above are of this type. In this connection we may mention a work of 
Corral(11) referred to by Miller. With substitutions on a given finite set of
letters in question, Corral calls a set of substitutions a perfect brigade if closed
under the operation ABC, an imperfect brigade if closed under the operation
$AB^{-1}C$. The former is then identical with a 3-group of ordinary substitutions,
the latter with a schar of substitutions, schar in Baer’s(12) wider form of a
concept due to Prüfer(13). Prüfer’s development had a great influence on


(7) G. A. Miller, Sets of group elements involving only products of more than n, Proceedings 
of the National Academy of Sciences, vol. 21 (1935), pp. 45-47. All references to Miller other 
than to Finite Groups (below) concern this paper. 

(8) The closing statement in Kasner’s abstract, which suggests an anticipation of our coset 
theorem for triadic groups, is more probably merely related thereto in similar fashion. 

(9) His condition that his set S contain no like subset is in error. Recognizing S as an
(n+1)-group of order h, we see from our §21 that his partial condition “h is a power of 2” should
be “every distinct prime factor of h is a factor of n.”

(10) W. Specht, Eine Verallgemeinerung der Permutationsgruppen, Mathematische Zeit-
schrift, vol. 37 (1933), pp. 321-341.

(11) J. I. Corral, Brigadas de Substituciones, Part I, Havana, 1934; Part II, Toledo, 1935.

(12) R. Baer, Zur Einführung des Scharbegriffs, Journal für die reine und angewandte
Mathematik, vol. 160 (1929), pp. 199-207. His abstract formulation occurs in the important
footnote on page 202. (Condition III therein can be proved in its entirety, and so is unneces-
sary.) The same footnote proves, in our terminology (see §5), that every schar is reducible to an
ordinary group. Had the same situation obtained for polyadic groups, there would have been no
need of our coset theorem.

(13) H. Prüfer, Theorie der Abelschen Gruppen, I. Grundeigenschaften, Mathematische Zeit-
schrift, vol. 20 (1924), pp. 165-187.




Dörnte who showed that by rewriting the operation $AB^{-1}C$ formally as ABC,
Prüfer’s schar becomes a special kind of 3-group. This reinterpretation is
however no longer possible if the Prüfer hypothesis $AB^{-1}C = CB^{-1}A$ is deleted
to give Baer’s schar.

While Dörnte’s development in large measure consists in extending
Prüfer’s schar results to m-groups, our own work correspondingly attempts
to generalize ordinary group theory. Thus, at the very beginning of our de-
velopments, where Dörnte recognizes no identity for an m-group with m>2,
we find that role played by certain sequences of m-1 elements of the m-
group, and are thus led to a development culminating in the coset theorem
of §3. The remainder of Part I, which is really a theory of abstract polyadic 
groups finite or infinite, consists of largely unrelated topics, but each funda- 
mental in the theory. Our program crystallizes in Part II which, in A, B, C, 
systematically generalizes most of the general topics of three chapters in the 
Miller, Blichfeldt, Dickson, Finite Groups(14), that is, Miller’s Chapters II
and III on substitution groups and abstract groups respectively, and Chapter 
IX, Blichfeldt’s introductory chapter on linear groups. The reader will find 
here certain developments which merely paraphrase the ordinary theory, 
others which are far richer in their polyadic form, and still others which have 
no counterpart in ordinary theory. On the whole, the amount that does go 
over is surprisingly large. The principal failure is the but partial extension 
of Sylow’s theorem. To the student of ordinary groups we may point out,
among other connections, that the generalizations quasi-abelianism and
quasi-commutator subgroup of §30 remain significant for ordinary groups,
that §5 also gives a polyadic superstructure to any ordinary group, and that
the coset theorem could be used to translate polyadic group results inde-
pendently arrived at into ordinary group properties. While much of Dörnte’s
paper becomes clarified by means of our coset theorem, and several of his
developments are carried considerably further in our own work, the present
paper by no means can be said to supplant Dörnte’s. We are furthermore di-
rectly indebted to him for his concepts of semi-invariant subgroup and semi-
abelian group.

Useful as the coset theorem is in establishing certain properties of polyadic 
groups, its very existence greatly minimizes the significance of that general- 
ization. Nevertheless, we cannot agree with Miller who says “the generaliza- 
tion secured by using perfect cosets instead of groups is, however, only 
apparent.” In its autonomous formulation, polyadic group is fundamentally 
a generalization of ordinary group and, indeed, it is as generalization that 


(14) New York, 1916. Henceforth referred to as Finite Groups. Where in Part II the writer
refers to the standard proof of an ordinary group result it is the proof in this text that is meant. 
We may note here that when an ordinary group term is applied without explicit definition to 
polyadic groups, its polyadic definition is entirely similar. 




it lends itself to a corresponding development(15). However, the final verdict
will undoubtedly hang on the question of application(16). For this end our
concept of m-adic invariant is no doubt far too special (see §39). Genuine 
application of polyadic groups will probably therefore have to wait upon the 
formulation of an adequate concept of polyadic invariant. 

We wish here to express our obligation to B. P. Gill to whose efforts we 
owe the completion of a major phase of our development (see §12). Had we 
completed the determination of the triadic linear groups in two variables 
mentioned in our preliminary report, this obligation would have been still 
greater. We are also indebted to R. Baer who, on two separate occasions, set 
us on the right path in the maze of ordinary group literature. 


I. GENERAL THEORY OF POLYADIC GROUPS 


1. Definition of a polyadic group. Given a class of elements C, and an
operation $c(s_1 s_2 \cdots s_m)$, we shall say that the elements of C constitute an
m-adic group G under c if the following two conditions are satisfied:

1. If any m of the m+1 symbols in an equation of the form

$c(s_1 s_2 \cdots s_m) = s_{m+1}$

represent elements in C, the remaining symbol also represents an element in
C, and is uniquely determined by this equation.

2. The elements of C satisfy the associative law under c, that is, they
satisfy

$c(c(s_1 s_2 \cdots s_m) s_{m+1} s_{m+2} \cdots s_{2m-1}) = c(s_1 c(s_2 \cdots s_m s_{m+1}) s_{m+2} \cdots s_{2m-1}) = \cdots = c(s_1 s_2 \cdots s_{m-1} c(s_m s_{m+1} \cdots s_{2m-1}))$(17).
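
For a finite class C the two conditions can be verified by exhaustion. The sketch below is an illustration only, under stated assumptions: the helper names (`compose`, `is_odd`, `is_m_adic_group`) are hypothetical, permutations are represented as tuples, and the examples are the odd permutations on three letters under the triple product and the residues modulo 5 under a-b+c, both of which are mentioned in the Introduction as giving 3-groups.

```python
from itertools import permutations, product

def compose(*perms):
    """Product of permutations, applied left to right (tuples: index -> image)."""
    result = tuple(range(len(perms[0])))
    for p in perms:
        result = tuple(p[x] for x in result)
    return result

def is_odd(p):
    """True when the permutation has an odd number of inversions."""
    n = len(p)
    return sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j]) % 2 == 1

def is_m_adic_group(elements, op, m):
    """Brute-force test of conditions 1 and 2 for a finite class of elements."""
    elements = list(elements)
    # Condition 1: with the other m-1 arguments fixed, x -> c(... x ...) must be
    # a one-to-one map of C onto itself, for every argument position.
    for pos in range(m):
        for rest in product(elements, repeat=m - 1):
            images = [op(*rest[:pos], x, *rest[pos:]) for x in elements]
            if sorted(images) != sorted(elements):
                return False
    # Condition 2: the inner c may occupy any of the m positions.
    for seq in product(elements, repeat=2 * m - 1):
        values = {op(*seq[:k], op(*seq[k:k + m]), *seq[k + m:]) for k in range(m)}
        if len(values) != 1:
            return False
    return True

odd = [p for p in permutations(range(3)) if is_odd(p)]

print(is_m_adic_group(odd, compose, 3))                                     # triple products of odd permutations
print(is_m_adic_group(odd, compose, 2))                                     # ordinary products leave the class
print(is_m_adic_group(range(5), lambda a, b, c: (a - b + c) % 5, 3))        # a - b + c modulo 5
print(is_m_adic_group(range(5), lambda a, b, c: (a + 2 * b + c) % 5, 3))    # solvable but not associative
```

The last example is included to show that unique solvability alone does not suffice; the operation a+2b+c satisfies condition 1 but fails the associative law.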


(15) It is fundamental to remember, in this connection, that we are dealing not with a
mere class of elements, but with a class of elements and an operation thereon; still better, 
with the properties of a class of elements under a given operation. Thus the genuineness of 
non-Euclidean geometry is not affected because it can be represented by certain constructions 
in Euclidean geometry. Had Miller’s point of view been adopted, such a development as that of 
§5, for example, would hardly have been possible. 

(16) E.g., such as the Galois theory in the case of ordinary groups, not applications, such
as the examples given above, which are mere illustrations of polyadic groups or of the theory 
thereof. Much of Corral’s development concerns a brigade Galois theory. But this seems to the 
writer to be merely a restatement of standard Galois theory in terms of brigades rather than 
a genuine application. 

(17) This formulation, patterned by the author after Miller, is identical with Dörnte’s
except that Dörnte splits up our 1 into two parts, $P_1$ and $P_3$, according as $s_{m+1}$, or $s_i$, $i \ne m+1$,
is to be determined. It is then readily proved by the methods of our next section that in $P_3$
only the existence of the solution $s_i$ need be postulated, its uniqueness being then provable.
It can further be shown that this existence of a solution for $s_i$ need only be universally postu-
lated either for a single i with 1<i<m, or for both i=1 and i=m, the existence of a solution
for $s_i$ for all other i’s from 1 to m then being provable. If the second form be used in place of $P_3$,
and the first can only be used for m>2, the resulting set of postulates would be the exact
generalization of the basis for ordinary groups used by Albert in his Modern Higher Algebra.




We shall also use Dörnte’s phrase “m-group” for G. Though these conditions
are vacuously satisfied when C is a null class, the ordinary group concept
tacitly assumes the existence of at least one element, and so we make the
same assumption here. An ordinary group is then immediately an m-adic
group with m=2, that is, a dyadic group, or 2-group. Unlike Dörnte, we ex-
clude the case m=1.

It is readily proved by induction that the number of elements entering
into any combination of elements built up by the operation c is of the form
k(m-1)+1, where, in fact, k is the number of c’s in the assumed symbolic
expression of this “extended operation.” As the basic operation $c(s_1 s_2 \cdots s_m)$
is on an ordered m-ad of elements, an extended operation built up by c’s
orders the k(m-1)+1 elements appearing therein in a linear array
$s_1, s_2, \cdots, s_{k(m-1)+1}$. It is then readily proved that as a consequence of the
associative law 2 the element given by such an extended operation depends
only on the sequence $s_1, s_2, \cdots, s_{k(m-1)+1}$, and is independent of the particular
way in which parentheses are introduced in conjunction with the k c’s that
must enter into such an expression. We are justified, then, in briefly writing
any such extended operation $c(s_1 s_2 \cdots s_{k(m-1)+1})$.

2. Identity, inverse, equivalence. Let $a_1, a_2, \cdots, a_{m-1}, a_m$ be elements of
an m-adic group G satisfying the equation

$c(a_1 a_2 \cdots a_{m-1} a_m) = a_m.$

Assuming as we do that $m \ge 2$, we can, in fact, let $a_m$ and m-2 of the m-1
elements $a_1, a_2, \cdots, a_{m-1}$ be arbitrary elements of G, and then determine the
remaining element in accordance with 1 of §1 so that this equation will be sat-
isfied. If now s be any element of G, we can likewise find $s_2, s_3, \cdots, s_m$ in G
so that $c(a_m s_2 s_3 \cdots s_m) = s$. By our assumed equation we will have

$c(c(a_1 a_2 \cdots a_{m-1} a_m) s_2 s_3 \cdots s_m) = c(a_m s_2 s_3 \cdots s_m).$

Hence, by the associative law,

$c(a_1 a_2 \cdots a_{m-1} c(a_m s_2 s_3 \cdots s_m)) = c(a_m s_2 s_3 \cdots s_m),$

and so

$c(a_1 a_2 \cdots a_{m-1} s) = s.$

That is, if the equation $c(a_1 a_2 \cdots a_{m-1} s) = s$ holds for one s in G, it holds for
every s in G. The sequence, or (m-1)-ad, $\{a_1, a_2, \cdots, a_{m-1}\}$ may then be
called a left identity of G. In the same way we can show that if $c(s b_1 b_2 \cdots b_{m-1}) = s$
holds for one s in G, it holds for every s in G, and $\{b_1, b_2, \cdots, b_{m-1}\}$ may be
called a right identity of G.

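We now prove that every left identity of G is a right identity, and conversely,
thus arriving at the unique concept of an (m-1)-ad as an identity of an
m-adic group. Let $\{a_1, a_2, \cdots, a_{m-1}\}$ be a left identity. Then $c(a_1 a_2 \cdots a_{m-1} a_1) = a_1$.
By the associative law,

$c(a_{m-1} a_1 \cdots a_{m-2}\, c(a_{m-1} a_1 a_2 \cdots a_{m-1})) = c(a_{m-1}\, c(a_1 a_2 \cdots a_{m-1} a_1)\, a_2 \cdots a_{m-1}).$

Hence

$c(a_{m-1} a_1 \cdots a_{m-2}\, c(a_{m-1} a_1 a_2 \cdots a_{m-1})) = c(a_{m-1} a_1 a_2 \cdots a_{m-2} a_{m-1}).$

Since the first m-1 arguments of the two members of this equation are iden-
tical, the last must also be equal by 1, §1. Hence

$c(a_{m-1} a_1 a_2 \cdots a_{m-1}) = a_{m-1},$

and $\{a_1, a_2, \cdots, a_{m-1}\}$ is also a right identity. Similarly for the converse.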

Our equation $c(a_1 a_2 \cdots a_{m-1} a_1) = a_1$ shows that if $\{a_1, a_2, \cdots, a_{m-1}\}$ is an
identity, so is $\{a_2, \cdots, a_{m-1}, a_1\}$. Hence also $\{a_3, \cdots, a_{m-1}, a_1, a_2\}$, and so
on. Of course we have used the preceding result on left identities being the
same as right identities. In general, then, if $\{a_1, a_2, \cdots, a_{m-1}\}$ is
an identity, so is $\{a_{i+1}, \cdots, a_{m-1}, a_1, \cdots, a_i\}$. Otherwise stated, cyclic per-
mutation of the elements of an identity leaves it an identity.

Our initial observation proved the existence of an identity for $m \ge 2$.
Clearly, if $\{a_1, a_2, \cdots, a_{m-1}\}$ is an identity, it is immaterial which m-2 of
these elements were assumed arbitrarily. Hence all identities of an m-adic
group can be obtained by arbitrarily assigning values to, say, $a_1, a_2, \cdots, a_{m-2}$,
and correspondingly determining $a_{m-1}$. If G be of finite order g, there are
$g^{m-1}$ (m-1)-ads formed from elements of G. Hence G has $g^{m-2}$ identities.
There will be no ambiguity if we use similar terminology when g is infinite.
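
The count of identities, and the agreement of left with right identities, can be confirmed on small examples. The following sketch is illustrative only: the function `identities` is a hypothetical helper, and the two groups used (residues modulo 5 under a-b+c as a 3-group, and the residues 1 and 4 modulo 6 under the sum of four terms as a 4-group) are assumed examples chosen for the purpose.

```python
from itertools import product

def identities(elements, op, m, side="left"):
    """All (m-1)-ads acting as a left (or right) identity on every element."""
    found = set()
    for ad in product(elements, repeat=m - 1):
        if side == "left":
            ok = all(op(*ad, s) == s for s in elements)
        else:
            ok = all(op(s, *ad) == s for s in elements)
        if ok:
            found.add(ad)
    return found

examples = [
    # (elements, operation, m)
    (list(range(5)), lambda a, b, c: (a - b + c) % 5, 3),
    ([1, 4], lambda a, b, c, d: (a + b + c + d) % 6, 4),
]

for elements, op, m in examples:
    g = len(elements)
    left = identities(elements, op, m, "left")
    right = identities(elements, op, m, "right")
    # Left and right identities coincide, and there are g**(m-2) of them.
    print(m, left == right, len(left), g ** (m - 2))
    # A cyclic permutation of an identity is again an identity.
    print(all(ad[1:] + ad[:1] in left for ad in left))
```

In the first example the identities are the dyads {a, a}; in the second they are the triads of elements of {1, 4} whose sum is divisible by 6.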

While the term identity will thus mean an (m-1)-ad of the above kind,
a corresponding development in connection with an extended operation on
k(m-1)+1 arguments leads to what may be termed an extended identity
in the form of a k(m-1)-ad. Except for their number, extended identities en-
joy the same properties as identities. Rather unsymmetrically we may say
that $\{a_1, a_2, \cdots, a_{k(m-1)}\}$ is an extended identity if
$\{a_1, a_2, \cdots, a_{m-2}, c(a_{m-1} \cdots a_{k(m-1)})\}$ is an identity.

The concept of identity immediately leads to that of inverse. For m=2,
the inverse of an element s is an element which multiplied into s yields the
identity. For m>2, to obtain an identity from an element s we must annex
m-2 other elements. We are thus led to an (m-2)-ad as an inverse of s.
Hence, for m>2, an inverse of an element is an element when and only when
m=3. $\{s_1, s_2, \cdots, s_{m-2}\}$ is then an inverse of s if $\{s, s_1, s_2, \cdots, s_{m-2}\}$ is an
identity. As $\{s_1, s_2, \cdots, s_{m-2}, s\}$ is then also an identity, we may therefore
say that s is an inverse of the (m-2)-ad $\{s_1, s_2, \cdots, s_{m-2}\}$. We are thus led
to define inverse for i-ads with arbitrary i.

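First let i<m-1. We then define an inverse of an i-ad $\{s_1, s_2, \cdots, s_i\}$
to be an (m-i-1)-ad $\{s_1', \cdots, s_{m-i-1}'\}$ such that $\{s_1, \cdots, s_i, s_1', \cdots, s_{m-i-1}'\}$
is an identity. As $\{s_1', \cdots, s_{m-i-1}', s_1, \cdots, s_i\}$ is
then also an identity, $\{s_1, \cdots, s_i\}$ is an inverse of $\{s_1', \cdots, s_{m-i-1}'\}$,
so that we can talk of a pair of inverse polyads. When i=m-1 we must
have recourse to an extended identity, and are thus led to an (m-1)-ad
as inverse. $\{s_1', s_2', \cdots, s_{m-1}'\}$ is then an inverse of $\{s_1, s_2, \cdots, s_{m-1}\}$ if
$\{s_1, s_2, \cdots, s_{m-1}, s_1', s_2', \cdots, s_{m-1}'\}$ is an extended identity. As before,
$\{s_1, s_2, \cdots, s_{m-1}\}$ is also an inverse of $\{s_1', s_2', \cdots, s_{m-1}'\}$.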

By means of inverses we easily solve an equation of the form

$c(a_1 a_2 \cdots a_i\, s\, b_1 b_2 \cdots b_j) = t$

for s(18). Let $\{a_1', \cdots, a_{m-i-1}'\}$, $\{b_1', \cdots, b_{m-j-1}'\}$ be inverses of
$\{a_1, a_2, \cdots, a_i\}$, $\{b_1, b_2, \cdots, b_j\}$ respectively. Operating on both sides
of the above equation by $c(a_1' a_2' \cdots a_{m-i-1}' \,|\, b_1' b_2' \cdots b_{m-j-1}')$, the bar indicating
the missing argument, applying the associative law, and reducing the left-
hand side by the property of identities we obtain

$s = c(a_1' a_2' \cdots a_{m-i-1}'\, t\, b_1' b_2' \cdots b_{m-j-1}').$

When a’s or b’s are missing, our inverse of an (m-1)-ad serves the same pur-
pose. Clearly an equation of the same type arising from an extended operation
can always be reduced to the above type by means of the associative law.
Our need of inverses of i-ads with i>m-1 is thus not pressing. However,
they can be similarly introduced by means of extended identities. While such
an inverse can always be a j-ad with $1 \le j \le m-1$, to preserve the symmetry
of the inverse relationship we must allow j>m-1 as well, and thus have to
introduce extended inverses. Thus if i=k(m-1)+l, $0 \le l < m-1$, an inverse
will be an (m-l-1)-ad, while all extended inverses will have j in the form
$\kappa(m-1)+(m-l-1)$.

The multiplicity of inverses when the latter are not single elements leads
to the concept of equivalent i-ads. We can introduce that concept directly,
however, as follows. Let $\{a_1, a_2, \cdots, a_i\}$ and $\{b_1, b_2, \cdots, b_i\}$ be such that
for some specific $d_1, \cdots, d_j$, $e_1, \cdots, e_{m-i-j}$

$c(d_1 \cdots d_j\, a_1 a_2 \cdots a_i\, e_1 \cdots e_{m-i-j}) = c(d_1 \cdots d_j\, b_1 b_2 \cdots b_i\, e_1 \cdots e_{m-i-j}),$

that is, replacing the sequence $a_1, a_2, \cdots, a_i$ by $b_1, b_2, \cdots, b_i$ in the specific
operation given by the left-hand member of this equation leaves the result
unaltered. Let $\{d_1', \cdots, d_{m-j-1}'\}$ and $\{e_1', \cdots, e_{i+j-1}'\}$ be inverses of
$\{d_1, \cdots, d_j\}$, $\{e_1, \cdots, e_{m-i-j}\}$ respectively, and let $s_1, \cdots, s_k, \cdots, s_{m-i}$

(18) Dörnte solves this equation for m>2 by his “querelement” $\bar a$, defined as the solution
of the equation $c(a \cdots a\, x) = a$ for x. The very economy of this concept, however, helps ob-
scure the concepts of our present section, so necessary for the basic coset theorem. It may be
pointed out that actually our method of solution can be so presented as to be independent of
the previous theorems on identities, and thus leads to that part of the footnote to §1 concerning
the provability of the uniqueness of the solution. Indeed, in this primordial form, the same
method is constantly used by Dörnte without specific formulation. The reader may be inter-
ested in noting that Dörnte’s Theorems 3 and 4, §1, may be considered special cases of our
identity results in that the definition of $\bar a$ may now be restated: $\{a, \cdots, a, \bar a\}$ is a right
identity.


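be arbitrary elements of G. Operating on both sides of the above equation by
$c(s_1 \cdots s_k\, d_1' \cdots d_{m-j-1}' \,|\, e_1' \cdots e_{i+j-1}'\, s_{k+1} \cdots s_{m-i})$ we obtain, after simplification,

$c(s_1 \cdots s_k\, a_1 a_2 \cdots a_i\, s_{k+1} \cdots s_{m-i}) = c(s_1 \cdots s_k\, b_1 b_2 \cdots b_i\, s_{k+1} \cdots s_{m-i}).$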


A similar argument can be given when j, or m-i-j, is 0 or m-1. Hence, if the
sequence $b_1 b_2 \cdots b_i$ can replace $a_1 a_2 \cdots a_i$ somewhere in one operation it can
do so anywhere in any operation(19). Clearly the same result holds good for
extended operations as well. The i-ads $\{a_1, a_2, \cdots, a_i\}$ and $\{b_1, b_2, \cdots, b_i\}$
will then be said to be equivalent. Thus we may define an m-group G to be
abelian if the dyads $\{s_1, s_2\}$ and $\{s_2, s_1\}$ are equivalent for every pair of ele-
ments $s_1$, $s_2$ of G. For then the value of $c(s_1 s_2 \cdots s_m)$, s’s in G, is unaltered by
any interchange of adjacent s’s, and hence by any permutation of all the s’s.

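Let $\{a_1, a_2, \cdots, a_i\}$ and $\{b_1, b_2, \cdots, b_i\}$ be equivalent i-ads, and
let $\{a_1', a_2', \cdots, a_{m-i-1}'\}$ be an inverse of $\{a_1, a_2, \cdots, a_i\}$. We have then
$c(a_1' a_2' \cdots a_{m-i-1}'\, a_1 a_2 \cdots a_i\, s) = s$. Hence also $c(a_1' a_2' \cdots a_{m-i-1}'\, b_1 b_2 \cdots b_i\, s) = s$
so that $\{a_1', a_2', \cdots, a_{m-i-1}'\}$ is an inverse of $\{b_1, b_2, \cdots, b_i\}$ as well. A simi-
lar argument applies when i=m-1. That is, every inverse of one of a pair of
equivalent i-ads is also an inverse of the other. Again, let $\{a_1, a_2, \cdots, a_i\}$ and
$\{b_1, b_2, \cdots, b_i\}$ both be inverses of $\{a_1', a_2', \cdots, a_{m-i-1}'\}$. Since we then
have $c(a_1' \cdots a_{m-i-1}'\, a_1 \cdots a_i\, s) = s = c(a_1' \cdots a_{m-i-1}'\, b_1 \cdots b_i\, s)$, it fol-
lows that $\{a_1, a_2, \cdots, a_i\}$ and $\{b_1, b_2, \cdots, b_i\}$ are equivalent. That is,
inverses of the same polyad are equivalent. It follows from these results that
if $\{a_1', a_2', \cdots, a_{m-i-1}'\}$ is an inverse of $\{a_1, a_2, \cdots, a_i\}$, the class of
inverses of $\{a_1, a_2, \cdots, a_i\}$ is the class of (m-i-1)-ads equivalent
to $\{a_1', a_2', \cdots, a_{m-i-1}'\}$. Conversely, the class of i-ads equivalent to
$\{a_1, a_2, \cdots, a_i\}$ is the class of inverses of $\{a_1', a_2', \cdots, a_{m-i-1}'\}$. Finally,
the first class is the class of inverses of each member of the second, and con-
versely. This for i<m-1. For i=m-1 both classes consist of (m-1)-ads.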

We shall speak of the class of all i-ads equivalent to a given i-ad as a
class of equivalent i-ads. As in the case of identities, to obtain all i-ads
equivalent to a given i-ad we may assign arbitrary values to $i-1$ of the elements,
the ith being then determined. We may therefore say that a class of
equivalent i-ads has $g^{i-1}$ members. If, on the other hand, we keep $i-1$ elements
fixed, and let the remaining element run through G, condition 1 of §1 shows that no
two of the resulting i-ads can be equivalent, while each class of equivalent
i-ads thus finds a representative. We may therefore say that for each i there
are exactly g classes of equivalent i-ads. These classes are, of course, mutually
exclusive. For $i = 1$ they are nothing more than the unit classes consisting of
single elements of G. For $i = m-1$ one class of equivalent i-ads is singled out,
that is, the class of identities.
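
The counting statement can be checked by brute force on a small case. The following Python sketch is an illustration only and not part of the original argument. It assumes, as example, the 3-group formed by the three transpositions of the symmetric group on three letters under the triple product; the names `compose`, `c` and `equivalent` are hypothetical helpers. It sorts the nine dyads into classes of equivalent dyads and reports the number of classes together with their sizes, which should be g = 3 classes of $g^{i-1}$ = 3 members each.

```python
from itertools import product

# Permutations of {0, 1, 2} as tuples p, where p[x] is the image of x.
def compose(p, q):
    """Return the permutation 'first p, then q'."""
    return tuple(q[p[x]] for x in range(3))

# Assumed example: G = the three transpositions of S_3, with m = 3.
G = [(1, 0, 2), (2, 1, 0), (0, 2, 1)]

def c(x, y, z):
    """Ternary operation of the example: the product of three elements."""
    return compose(compose(x, y), z)

def equivalent(p, q):
    """Dyads p and q are equivalent if either may replace the other in c."""
    return all(c(p[0], p[1], s) == c(q[0], q[1], s) and
               c(s, p[0], p[1]) == c(s, q[0], q[1]) for s in G)

# Sort all dyads into classes of equivalent dyads.
classes = []
for dyad in product(G, repeat=2):
    for cls in classes:
        if equivalent(dyad, cls[0]):
            cls.append(dyad)
            break
    else:
        classes.append([dyad])

g = len(G)
print(len(classes), sorted(len(cls) for cls in classes))
```

In this example two dyads are equivalent exactly when their products as permutations agree, so the three classes correspond to the identity and the two 3-cycles.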


(19) This result is proved in part by Dörnte as Theorem 2, §1, but the corresponding concept
is not formulated. Clearly this relationship between i-ads is an “equivalence relationship.”




3. The coset theorem. We are now in a position to embed our m-adic
group G in an ordinary group. Let C* be the class of all classes of equivalent
i-ads for $i = 1, 2, \cdots, m-1$. Each element of C* is thus a class of equivalent
i-ads, and C* may then be said to have $(m-1)g$ elements, g for each i.
It is convenient to drop the distinction between a unit class and its sole
member, so that we may consider C, the class of elements of G, a subclass
of C*. We proceed now to define a dyadic operation on the elements of C*.
But first we must remove the above tacit restriction $i < m$ in our discussion of
equivalence. Clearly, by using extended operations, our results go over for
$i \ge m$. Furthermore, we can extend the concept of equivalence to allow an
i-ad to be equivalent to a j-ad. With only the basic operation c involved, we
must clearly have $j-i$ a multiple of $m-1$. Without further elaboration,
{ bi, be, } will be equivalent to { a1, de, -,a;} if { bi, be, + +, 
bi+%¢m—1) } and { a, @:$ are equivalent in the original 
sense(20).

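We first prove the following: if two of the three polyads $\{a_1, a_2, \cdots, a_i\}$,
$\{b_1, b_2, \cdots, b_j\}$, $\{a_1, a_2, \cdots, a_i, b_1, b_2, \cdots, b_j\}$ are respectively
equivalent to the corresponding two of the three polyads $\{a_1', a_2', \cdots, a_i'\}$,
$\{b_1', b_2', \cdots, b_j'\}$, $\{a_1', a_2', \cdots, a_i', b_1', b_2', \cdots, b_j'\}$, the remaining
polyads are equivalent. We shall prove this result for $i+j < m$, a corresponding
proof with the use of extended operations serving for $i+j \ge m$.
Consider then the operations $c(a_1 a_2 \cdots a_i b_1 b_2 \cdots b_j d_1 \cdots d_{m-i-j})$ and
$c(a_1' a_2' \cdots a_i' b_1' b_2' \cdots b_j' d_1 \cdots d_{m-i-j})$. If the first and second polyads of the
first set of three are respectively equivalent to the first and second of the
second set of three, we will have
$c(a_1 a_2 \cdots a_i b_1 b_2 \cdots b_j d_1 \cdots d_{m-i-j}) = c(a_1' a_2' \cdots a_i' b_1 b_2 \cdots b_j d_1 \cdots d_{m-i-j}) = c(a_1' a_2' \cdots a_i' b_1' b_2' \cdots b_j' d_1 \cdots d_{m-i-j})$,
and the third polyads are equivalent. If the hypothesis concerns
the first and third polyads, then
$c(a_1' a_2' \cdots a_i' b_1 b_2 \cdots b_j d_1 \cdots d_{m-i-j}) = c(a_1 a_2 \cdots a_i b_1 b_2 \cdots b_j d_1 \cdots d_{m-i-j}) = c(a_1' a_2' \cdots a_i' b_1' b_2' \cdots b_j' d_1 \cdots d_{m-i-j})$,
whence the corresponding conclusion. Similarly for the second and
third polyads.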

(20) And, of course, our basic theorem on equivalent i-ads extends to equivalent polyads.
It may then be noted that if we include a null sequence in this framework, an independent proof
of the identity of left and right identities results. In fact, the about-to-be-proved coset theorem
depends only on the concept of equivalence; and the properties of identity and inverse could
therefore be derived with the help of that theorem. Their direct formulation in terms of the
operation of the m-group, however, will be found indispensable for correct thinking on such
topics as those of §5.

Let then the dyadic operation $c^*(r_1 r_2)$ be defined as follows. If $r_1$ and $r_2$
are members of C*, and if $\{a_1, a_2, \cdots, a_i\}$ is in the class $r_1$ of equivalent i-ads,
$\{b_1, b_2, \cdots, b_j\}$ in the class $r_2$ of equivalent j-ads, then $c^*(r_1 r_2)$ is to be the
class of $(i+j)$-ads equivalent to $\{a_1, a_2, \cdots, a_i, b_1, b_2, \cdots, b_j\}$ when
$i+j \le m-1$, the class of $(i+j-(m-1))$-ads equivalent to
$\{a_1, a_2, \cdots, a_i, b_1, b_2, \cdots, b_j\}$ when $i+j > m-1$. When $i+j \le m-1$ our previous
results not only show that $c^*(r_1 r_2)$ is independent of the particular i-ad and j-ad chosen
from $r_1$ and $r_2$ respectively, but that if any two symbols in the equation $c^*(r_1 r_2) = r_3$
are assigned values in C*, the third is uniquely determined in C*. The
same is true when $i+j > m-1$ by the transitive property of equivalence.
Hence, condition 1 of §1 for a dyadic group is satisfied by (C*, c*); likewise
condition 2, that is, the associative law. For let $\{a_1, \cdots, a_i\}$, $\{b_1, \cdots, b_j\}$,
$\{c_1, \cdots, c_k\}$ be in $r_1$, $r_2$, $r_3$ respectively. Then, with equivalence extended
as above, if $i+j+k = l+\lambda(m-1)$, $1 \le l \le m-1$, both $c^*(c^*(r_1 r_2) r_3)$ and
$c^*(r_1 c^*(r_2 r_3))$ represent the class of l-ads equivalent to
$\{a_1, \cdots, a_i, b_1, \cdots, b_j, c_1, \cdots, c_k\}$, so that, for all members of C*,

$$c^*(c^*(r_1 r_2) r_3) = c^*(r_1 c^*(r_2 r_3)).$$

Hence, the members of C* constitute an ordinary group under c*. With G as the
given m-adic group, this ordinary group will be symbolized G*.
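
The construction of G* can be carried out mechanically for a small m-group. The Python sketch below is illustrative only. It assumes, as example, the 4-group whose elements are the residues 1, 4, 7 modulo 9 and whose operation is the sum of four elements modulo 9. Each class of equivalent i-ads is represented by the pair `(i, x)` standing for the i-ad `(E0, ..., E0, x)`, with `E0` a fixed filler element; this uses the fact stated earlier that fixing $i-1$ elements and letting the remaining one run through G gives one representative per class. The names `canon` and `c_star` are hypothetical.

```python
from itertools import product

M = 4                       # arity m of the assumed example
MOD = 9
G = [1, 4, 7]               # assumed example: the residues 1, 4, 7 mod 9
E0 = G[0]                   # fixed filler element for canonical representatives

def c(*xs):
    """m-adic operation of the example: sum of m elements mod 9."""
    assert len(xs) == M
    return sum(xs) % MOD

def canon(polyad):
    """Class of a polyad, as (i, x): the i-ad (E0, ..., E0, x) equivalent to it."""
    p = list(polyad)
    while len(p) >= M:                  # extended operation: collapse m entries
        p = [c(*p[:M])] + p[M:]
    i = len(p)
    pad = [E0] * (M - i)
    target = c(*(p + pad))
    # Agreement inside one operation suffices for equivalence; the solution x
    # is unique by the solvability condition of the m-group definition.
    for x in G:
        if c(*([E0] * (i - 1) + [x] + pad)) == target:
            return (i, x)
    raise ValueError("operation is not that of an m-group")

def c_star(r1, r2):
    """Dyadic operation on classes: juxtapose representatives, then reduce."""
    (i, x), (j, y) = r1, r2
    return canon([E0] * (i - 1) + [x] + [E0] * (j - 1) + [y])

C_star = [(i, x) for i in range(1, M) for x in G]

assoc = all(c_star(c_star(a, b), d) == c_star(a, c_star(b, d))
            for a, b, d in product(C_star, repeat=3))
identity = [e for e in C_star
            if all(c_star(e, r) == r == c_star(r, e) for r in C_star)]
solvable = all(len({c_star(a, r) for r in C_star}) == len(C_star)
               for a in C_star)
print(len(C_star), assoc, identity, solvable)
```

For this example the sketch finds $(m-1)g = 9$ classes, an associative operation, and a single identity lying among the classes of 3-ads, in agreement with the text.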

We have observed that we may consider the members of G to be members
of G*, that is, those classes of equivalent i-ads for which $i = 1$. We now further
observe that the operation $c(s_1 s_2 \cdots s_m)$ can be identified with the
extended operation $c^*(s_1 s_2 \cdots s_m)$ when, of course, the s's are in G. For
$c^*(s_1 s_2 \cdots s_m)$ is, indeed, the class of monads equivalent to $\{s_1, s_2, \cdots, s_m\}$,
and so consists of but the one monad $c(s_1 s_2 \cdots s_m)$(21). We shall therefore
call G* the abstract containing ordinary group of G, abstract by contrast with
other possibilities to be discussed later. In fact, G* is clearly determined by
the abstract form of G. And while G* as derived is not abstract, it may be
made so by replacing the members of C* by symbols formally obeying the
rule of combination c* as determined above.

(21) If then G has but a finite number of elements, Miller's theorem concerning perfect
cosets can be applied immediately to give the coset theorem that follows. However we here
make no such restriction on G.

To obtain a clearer view of the relationship between G and G*, and thus,
indeed, really to solve the problem of the essential nature of a polyadic group,
let us consider those members of G* which are classes of equivalent $(m-1)$-ads.
We have already observed that one of these g classes is the class of identities
of G. Now if in the equation

$$c^*(r_1 r_2) = r_3$$

any two of the three symbols represent classes of equivalent $(m-1)$-ads, so
does the third. It follows that the g classes of equivalent $(m-1)$-ads constitute
an ordinary group under c*, and hence a subgroup of G*. We shall symbolize
this ordinary group by $G_0$, and call it the associated ordinary group of G.
It is readily seen that $G_0$ is an invariant subgroup of G*(22). To prove that, it
is sufficient to show that in the equation

$$c^*(t r_1) = c^*(r_1 r_2)$$

if t is in $G_0$, $r_1$ in G*, then $r_2$ is in $G_0$. But if $r_1$ is a class of equivalent i-ads,
t being a class of equivalent $(m-1)$-ads, then $c^*(t r_1)$, and hence $c^*(r_1 r_2)$, is
also a class of equivalent i-ads. $r_2$, then, can only be a class of equivalent
$(m-1)$-ads, as was to be proved.

(22) Provided $m > 2$. For $m = 2$, $G^* = G = G_0$. If then we here allow the term subgroup to
include the group itself, the results of the present section are also valid for ordinary groups,
though in trivial fashion.

Let us now expand G* in cosets as regards its invariant subgroup $G_0$. As
in the invariance proof, if a multiplier r represents a class of equivalent i-ads,
the corresponding coset consists of classes of equivalent i-ads, and indeed,
constitutes the class of all g classes of equivalent i-ads. While this is
immediate when g is finite, in any case if $r_1$ is a class of equivalent i-ads, the
equation $c^*(r_2 r) = r_1$ demands that $r_2$ be in $G_0$, so that $r_1$ is in the coset in
question. Hence the expansion of G* as regards $G_0$ consists of exactly $m-1$
augmented cosets, each being the class of all g classes of equivalent i-ads, for
some $i = 1, 2, \cdots, m-1$. The elements of G itself therefore constitute one of
these cosets, that is, that one for which $i = 1$. Hence our basic theorem. Every
polyadic group is a coset of an ordinary group with respect to an invariant
subgroup, it being understood that the polyadic operation of the polyadic group
is an extension of the dyadic operation of the ordinary group.

With the relationship between G, $G_0$ and G* made thus precise, it becomes
desirable to simplify our notation. Hence, when but a single m-adic operation
c is involved, we shall write the corresponding dyadic operation $c^*(r_1 r_2)$
simply as the product $r_1 r_2$ of standard group theory. Our identification of
$c(s_1 s_2 \cdots s_m)$ with $c^*(s_1 s_2 \cdots s_m)$ therefore enables us to write $c(s_1 s_2 \cdots s_m)$,
simply, $s_1 s_2 \cdots s_m$. We now finally introduce the completely abstract view
of G* with symbols for elements. Clearly the element of G* corresponding to
the class of identities of G is the identity of G*, and so will be symbolized
by 1, as usual. With the elements of G* as symbols, it will be convenient to
call the symbol r, representing a class of equivalent i-ads, an i-ad. Thus
$s_1 s_2 \cdots s_i$ will be an i-ad when the s's are elements of G. Conversely, every
i-ad can be written thus. In particular, $G_0$, itself, consists of all distinct
products $s_1 s_2 \cdots s_{m-1}$ of $m-1$ elements in G. To avoid duplication, of course, we
may keep $m-2$ of these elements fixed, and let the remaining one run through G.


In particular, if s is an element of G, $s^i$ is an i-ad, and so may correspondingly
be used as multiplier in the expansion of G* in cosets as regards $G_0$. We
may therefore write this expansion

$$G^* = G_0 s + G_0 s^2 + \cdots + G_0 s^{m-2} + G_0 = s G_0 + s^2 G_0 + \cdots + s^{m-2} G_0 + G_0.$$

Most significantly we may then also write

$$G = G_0 s = s G_0.$$

Since $G_0$ consists of products of elements of G, we see that G* itself is
generated by the elements of G. The expansion of G* shows the quotient
group $G^*/G_0$ to be of order $m-1$, and, indeed, cyclic, with the element
corresponding to the given polyadic group G as generator. Our coset theorem is thus
more precise than its brief formulation, given above, would indicate.

By means of this theorem we shall be able to prove many results concern- 
ing polyadic groups by means of known results on ordinary groups. On the 
other hand, the following almost immediately obvious converse enables poly- 
adic group theory to make contributions to a certain aspect of ordinary group 
theory. To wit, if a coset of an ordinary group with respect to an invariant sub- 
group is of finite order m—1 as element of the corresponding quotient group, then 
the elements of the coset constitute a polyadic group under the product of m elements
as operation(23). Though easily proved directly, this result may be considered
a consequence of the general theorem of §8. It will also be generalized
at the end of the next section. Note that such a result cannot be true for a 
coset corresponding to an element of infinite order of the quotient group. 
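
The converse can be tried out numerically. The Python sketch below is an illustration under assumed data, not a proof: it takes the additive group of integers modulo 12, the subgroup H = {0, 4, 8}, and the coset 1 + H, so that the "product" of elements is their sum modulo 12. The helper names are hypothetical. It computes the order k of the coset in the quotient group, sets m = k + 1, and checks closure and unique solvability of the m-adic operation on the coset; associativity is inherited from the containing group. It also reports whether the coset is closed under sums of fewer elements.

```python
from itertools import product

N = 12
H = {0, 4, 8}                       # assumed subgroup of the integers mod 12
coset = {(1 + h) % N for h in H}    # the coset 1 + H

def order_in_quotient(x):
    """Least k >= 1 with k*x in H, i.e. the order of x + H in the quotient."""
    k = 1
    while (k * x) % N not in H:
        k += 1
    return k

def closed(m):
    """Is the coset closed under the sum of m of its elements?"""
    return all(sum(xs) % N in coset for xs in product(coset, repeat=m))

def uniquely_solvable(m):
    """With all but one argument and the value fixed, is the rest determined?"""
    for pos in range(m):
        for rest in product(coset, repeat=m - 1):
            for target in coset:
                sols = [x for x in coset
                        if sum(rest[:pos] + (x,) + rest[pos:]) % N == target]
                if len(sols) != 1:
                    return False
    return True

k = order_in_quotient(1)
m = k + 1
print(k, m, closed(m), uniquely_solvable(m), [closed(n) for n in range(2, m)])
```

In this example the coset has order 4 in the quotient group, so the coset is a 5-group under the sum of five elements, while sums of two, three or four of its elements leave the coset.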

4. Subgroups and transforms; expansion in cosets. Dörnte has treated
the subject of expansions of polyadic groups in cosets exhaustively. While not
possessing identities and inverses to lead to a concept of transforms, he was
enabled adequately to treat invariant subgroups by mere commutativity
properties. He further introduced what we shall refer to as semi-invariant
subgroups, a concept which the writer completely overlooked in his own
development, and was thus led to a more general concept of polyadic quotient
groups than is given by invariant subgroups. Nevertheless we shall reexamine
these concepts from the point of view of the coset theorem, and a theory of
transforms, since not only do they become clearer thereby, but indeed admit
of a certain degree of generalization.

A proper subclass of the class of elements of an m-adic group G will be
said to constitute a subgroup H of G if the elements of that subclass constitute
a polyadic group under the polyadic operation of G. This is clearly equivalent
to the following. If in an equation $c(s_1 s_2 \cdots s_m) = s_{m+1}$ any m elements
are in the subclass, the (m+1)-st is. For the rest of the definition of m-adic
group follows from the elements of the subclass being in G. Where no confusion
can result we shall occasionally allow G to be a subgroup (improper) of
itself. We proceed first to investigate the relationship between H* and G*,
$H_0$ and $G_0$.

(23) Already proved by Miller in equivalent form for finite groups.

With H* and G* considered as being composed of classes of equivalent
i-ads, only those members of H* which are in H will also be members of G*.
For if $\{s_1, s_2, \cdots, s_i\}$ is an i-ad of H, and hence also of G, the class of H
i-ads equivalent to $\{s_1, s_2, \cdots, s_i\}$ is but a proper subclass of the class of G
i-ads equivalent to $\{s_1, s_2, \cdots, s_i\}$ whenever $i > 1$. Nevertheless a 1-1
correspondence is thus set up between the members of H* and the members of G*
containing them. For the latter are mutually exclusive. Hence, when G* is
treated abstractly with symbols as elements, we may symbolize the members
of H* correspondingly; and as the operation $c^*(s_1 s_2)$, that is, $s_1 s_2$ as explained
above, when set up for G* now serves also for H*, H* thereby becomes a
subgroup of G*.

The $(m-1)$-ads of H* are then also $(m-1)$-ads of G*, so that $H_0$ is a
subgroup of $G_0$. If s is any element of H, we can simultaneously expand H* and
G* in the form

$$H^* = H_0 s + H_0 s^2 + \cdots + H_0 s^{m-2} + H_0,$$
$$G^* = G_0 s + G_0 s^2 + \cdots + G_0 s^{m-2} + G_0.$$

It follows that the $m-1$ augmented cosets of H* as regards $H_0$ are respectively
contained in the $m-1$ augmented cosets of G* as regards $G_0$. As an
immediate consequence, we have the following: Lagrange's theorem holds for finite
polyadic groups. For, defining the order of a polyadic group as the number of its
elements, the relations $G = G_0 s$, $H = H_0 s$ show that the order g of the polyadic
group G, and the order h of its subgroup H, are respectively the same as the
order of the ordinary group $G_0$, and its subgroup $H_0$; and hence, h is a divisor
of g.
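
As a concrete check of the divisibility statement, the following Python sketch enumerates the subgroups of one small 3-group by brute force. It is an illustration only: the 3-group is assumed to be the residues modulo 6 under $c(x, y, z) = x + y + z$ (mod 6), the improper subgroup G is included in the list, and `is_subgroup` is a hypothetical helper implementing the criterion that in an equation $c(s_1 s_2 s_3) = s_4$ any three elements in the subclass force the fourth.

```python
from itertools import combinations, product

N = 6
G = list(range(N))

def c(x, y, z):
    """Assumed ternary operation: x + y + z mod 6 (so m = 3)."""
    return (x + y + z) % N

def is_subgroup(S):
    """In c(s1 s2 s3) = s4, any three of the four in S must force the fourth."""
    S = set(S)
    for x, y, z in product(G, repeat=3):
        members = [x in S, y in S, z in S, c(x, y, z) in S]
        if sum(members) == 3:       # exactly one of the four lies outside S
            return False
    return True

subgroups = [S for r in range(1, N + 1)
             for S in combinations(G, r) if is_subgroup(S)]
orders = sorted(len(S) for S in subgroups)
print(subgroups)
print(orders, all(N % h == 0 for h in orders))
```

For this example every subgroup found has order dividing 6. The list also contains pairs of distinct subgroups with no element in common, such as the even and the odd residues.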

Since H generates H*, and in turn consists of the common elements of H*
and G, the correspondence between the subgroups H of G, and their abstract
containing groups H*, is 1-1. $H_0$ consists of the common elements of H* and
$G_0$, and hence is also determined by H. In fact, we shall find useful the result
that the products of $m-1$ elements chosen from a subgroup H of G constitute
a subgroup of $G_0$, namely $H_0$. On the other hand, different subgroups of G
may have the same associated ordinary group $H_0$. Hence, in general, we can
only say that the correspondence between the subgroups H of G, and their
associated ordinary groups $H_0$, is but many-one. Furthermore, not every
subgroup $H_0$ of $G_0$ need be the associated ordinary group of a subgroup H of G.
The coset theorem and its converse, indeed, show that the necessary and
sufficient condition that a subgroup $H_0$ of $G_0$ be the associated ordinary group of some
subgroup H of G is that there exist an element s of G such that $H_0$ is invariant
under s, while $s^{m-1}$ is in $H_0$. Indeed the subgroups of G are the distinct $H_0 s$'s
obtained from all $H_0$'s and s's satisfying this condition.

As has been observed by Dörnte, two subgroups H and K of an m-adic
group G need have no element in common. Thus, this will always be so if H
and K are distinct subgroups of G with the same associated group. If, however,
H and K do have an element in common, their common elements clearly
constitute a subgroup of each of the subgroups, if they are not identical with
one or the other. Moreover, if s be such a common element, by writing
$H = H_0 s$, $K = K_0 s$, we see that the associated group of the “crosscut” of H
and K is the crosscut of their associated groups.




We consider next the expansion of G in cosets as regards a subgroup H
thereof. $H_0$ is clearly a subgroup of G*. We may therefore expand G* in say
right cosets as regards $H_0$. Now it is immediately seen that such a coset of $H_0$
either has no element in G, or is completely contained in G. For if this coset
has an element s in common with G, then, since the coset can be written $H_0 s$,
and $H_0$ is contained in $G_0$, $H_0 s$ will be wholly contained in $G = G_0 s$. As all
the elements of G must appear in the given expansion of G*, we see that the
cosets in question containing elements of G constitute a separation of the
elements of G into mutually exclusive classes of elements. We may say then that
G has thus been expanded in right cosets as regards H. A similar result holds
for left cosets.

And now an immediate generalization. In the above discussion H served
only to introduce the subgroup $H_0$ of $G_0$. If then $H_0$ be any subgroup of $G_0$,
whether it corresponds to a subgroup H of G, or not, the above argument
holds without change. Hence, every subgroup of the associated ordinary group
of a polyadic group leads to an expansion of the polyadic group in right cosets,
and in left cosets, as regards that subgroup.
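
A small computation makes the two expansions visible. The Python sketch below is illustrative only; it assumes the 3-group of odd permutations of four letters under the triple product, whose associated ordinary group is then the group of even permutations, and takes for $H_0$ the cyclic subgroup generated by one 3-cycle. The helper names are hypothetical. It forms the right cosets $H_0 s$ and the left cosets $s H_0$ with s in G and checks that each family separates G into mutually exclusive classes.

```python
from itertools import permutations

def compose(p, q):
    """Permutation 'first p, then q' on {0, 1, 2, 3}."""
    return tuple(q[p[x]] for x in range(4))

def is_odd(p):
    """Parity of a permutation via its number of inversions."""
    inversions = sum(1 for i in range(4) for j in range(i + 1, 4) if p[i] > p[j])
    return inversions % 2 == 1

# Assumed 3-group: the odd permutations of four letters (m = 3).
G = [p for p in permutations(range(4)) if is_odd(p)]

# Assumed subgroup H0 of G0: generated by the 3-cycle 0 -> 1 -> 2 -> 0.
r = (1, 2, 0, 3)
H0 = [(0, 1, 2, 3), r, compose(r, r)]

right = {frozenset(compose(h, s) for h in H0) for s in G}
left = {frozenset(compose(s, h) for h in H0) for s in G}

def is_partition(cosets):
    """Do the cosets cover G without overlap?"""
    return (sum(len(k) for k in cosets) == len(G)
            and set().union(*cosets) == set(G))

print(len(right), is_partition(right), len(left), is_partition(left),
      right == left)
```

Here G has twelve elements and each expansion consists of four cosets of three elements. Since the chosen $H_0$ is not invariant under G, the right and left expansions need not coincide, and in this example they do not.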

Specifically, if in the expansion of G* in right cosets as regards $H_0$ the
corresponding multipliers which are in G are $s_\alpha, s_\beta, \cdots, s_\kappa$, then the expansion
of G in right cosets as regards $H_0$ can be written

$$G = H_0 s_\alpha + H_0 s_\beta + \cdots + H_0 s_\kappa.$$

Similarly for left cosets. A not easily proved theorem for ordinary finite
groups is that the coset multipliers may be so selected that they are the
same on the right as on the left. An immediate corollary of the preceding
formulation is that the same is true of finite polyadic groups.

It is sometimes necessary to consider the intersections of cosets in the
expansion of G in, say, right cosets as regards subgroups $H_0$, and $K_0$, of $G_0$.
We have then immediately that while a coset with respect to $H_0$ and a coset
with respect to $K_0$ may have no elements in common, if they do have a common
element s, then their common elements constitute the set $L_0 s$ where $L_0$
is the crosscut of $H_0$ and $K_0$. In particular, if G is finite, all such intersecting
pairs of cosets intersect in the same number of elements, namely, a number
equal to the order of the crosscut of $H_0$ and $K_0$.

Expansions of G in double cosets likewise admit of simple treatment. With
$H_0$ and $K_0$ arbitrary subgroups of $G_0$, we may expand G* in double cosets
$H_0 r K_0$. If any element of such a double coset is in G, the entire double coset
is contained in G. Hence, if in the expansion of G* we select those double
cosets with r in G, the result will be a separation of the elements of G into
mutually exclusive sets, that is, the expansion of G in double cosets as regards
$H_0$ and $K_0$. In particular, if G has subgroups H and K whose associated ordinary
groups are $H_0$ and $K_0$ respectively, the resulting expansion may be
spoken of as the expansion of G in double cosets as regards H and K, the case
considered by Dörnte(24).

We shall introduce the property of invariance through the more general
concept of transform. To insure the fundamental correctness of our concept,
we go back to first principles. Given an element s, and an i-ad $\{s_1, s_2, \cdots, s_i\}$,
both considered in the m-adic sense, we define the transform of s under
$\{s_1, s_2, \cdots, s_i\}$ to be the element

$$c(s_1' s_2' \cdots s_{m-i-1}' \, s \, s_1 s_2 \cdots s_i),$$

where $\{s_1', s_2', \cdots, s_{m-i-1}'\}$ is an inverse of $\{s_1, s_2, \cdots, s_i\}$. This for $i < m-1$;
a similar definition holds for $i = m-1$. Since all inverses of a given polyad are
equivalent, this transform is uniquely determined by s, and $\{s_1, s_2, \cdots, s_i\}$.
Since inverses of equivalent i-ads are also equivalent, it follows that
equivalent i-ads yield identical transforms of a given element.

In saying s and $\{s_1, s_2, \cdots, s_i\}$ are m-adic, we tacitly assume that there
is some m-adic group to which $s, s_1, s_2, \cdots, s_i$ belong. Let us then consider
the abstract containing ordinary group of this m-adic group, and treat it in
abstract form, with simplified notation. If, then, $\{s_1, s_2, \cdots, s_i\}$
corresponds to abstract i-ad r of the containing group, the $(m-i-1)$-ad
$\{s_1', s_2', \cdots, s_{m-i-1}'\}$ will correspond to an abstract $(m-i-1)$-ad $r'$ such that
if s be an element of the m-adic group, $r' r s = s$. Writing the identity of the
containing group as usual, we thus have $r' r = 1$, and hence in customary
notation, $r' = r^{-1}$. Consequently, if r represents a class of equivalent polyads of a
polyadic group, $r^{-1}$ represents the class of inverses of those polyads. The
transform of s under $\{s_1, s_2, \cdots, s_i\}$ can now be written $r^{-1} s r$. And so, the
transform of an element by an i-ad is the ordinary transform of that element
by the corresponding abstract i-ad in the abstract containing group.

We can now extend our concept of transform to that of the transform of a
polyad by a polyad. In general, via the abstract containing group, the
transform of $r_1$ by $r_2$ is $r_2^{-1} r_1 r_2$. Had we resorted to our primitive concepts in this
case, we would have, as with inverses, a class of equivalent transforms. We
readily see that in all cases the transform of an i-ad, $i \le m-1$, is an i-ad.

(24) At first glance it would appear that Dörnte's expansions in cosets and double cosets,
while depending on actual subgroups of G, are more general than we have stated them to be.
However, it is readily seen that Dörnte's expansions with respect to a subgroup, or subgroups,
of G are our expansions of G with respect to transforms, in the sense defined below, of the
given subgroup or subgroups by polyads of G. And since these transforms are again subgroups
of G, the Dörnte expansions are no more general than we have stated them to be.

Consider now an m-adic group G, and an i-ad r not necessarily an i-ad
of G. Then, as with ordinary groups, if each element of G is transformed by r,
there results an m-adic group G' which may be said to be simply isomorphic
with G, and will be termed the transform of G under r. In fact, let s' be the
transform under r of any element s of G. Since
$r^{-1} s_1 r \cdot r^{-1} s_2 r \cdots r^{-1} s_m r = r^{-1} s_1 s_2 \cdots s_m r$,
we see that the relationship $s_1 s_2 \cdots s_m = s_{m+1}$ is equivalent
to $s_1' s_2' \cdots s_m' = s_{m+1}'$. The defining properties 1 and 2 for an m-adic group
then follow immediately for the transform of G from the selfsame properties
for G; hence the m-adic group G'. In general, two m-adic groups G and G'
may be said to be simply isomorphic if a 1-1 correspondence can be set up
between their elements such that if s' of G' is the correspondent of s in G,
then we will have, for all elements of G,

$$[c(s_1 s_2 \cdots s_m)]' = c'(s_1' s_2' \cdots s_m'),$$

c and c' designating the m-adic operations of G and G' respectively. For G' the
transform of G this is immediate with c and c' the common unexpressed
m-adic operation.

We reserve a more detailed treatment of transforms for our study of finite
polyadic groups, and turn to the question of invariance. An m-adic element,
polyad, or group will be said to be invariant under an i-ad if it is transformed
into itself by that i-ad. It will then be said to be invariant under an m-adic
group if it is invariant under every polyad of that group. Since G* is generated
by G, it follows that for K to be invariant under G, it is sufficient that it
be invariant under every element of G. If such a K is an element (subgroup)
of G it will then be said to be an invariant element (subgroup) of G. Clearly,
the condition that an m-group G be abelian is equivalent to each of its
elements being an invariant element of G. For, in the notation of the coset
theorem, $\{s_1, s_2\}$ and $\{s_2, s_1\}$ being equivalent becomes $s_1 s_2 = s_2 s_1$, or,
$s_2^{-1} s_1 s_2 = s_1$, and conversely.

Given an invariant subgroup H of G, the expansion of G in cosets as regards
H immediately leads to an m-adic quotient group G/H. In fact, since H
is invariant under G, it immediately follows that $H_0$, the associated 2-group
of H, is also invariant under G; that is, $H_0$, as subgroup of G*, is invariant
under each element of G considered as element of G*. For $H_0$ consists of all
products of $m-1$ elements chosen arbitrarily and independently from H.
Hence the transform of $H_0$ under any element s of G consists of all products of
$m-1$ elements chosen arbitrarily and independently from the transform of H
under s, that is, from H all over again.

Consider then the expansion in cosets $G = H_0 s_\alpha + H_0 s_\beta + \cdots + H_0 s_\kappa$.
Then, exactly as in ordinary group theory, the coset in which the element
$s_1 s_2 \cdots s_m$ appears depends only on the cosets containing the elements
$s_1, s_2, \cdots, s_m$. If then $\sigma_1, \sigma_2, \cdots, \sigma_m$ represent the cosets containing
$s_1, s_2, \cdots, s_m$ respectively, we may write the coset containing $s_1 s_2 \cdots s_m$
in the form $\sigma_1 \sigma_2 \cdots \sigma_m$. An m-adic operation is thus determined on these
cosets as elements; and, again as in classic theory, these cosets constitute an
m-adic group under this operation. We may therefore call this group the
quotient group G/H.
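
That the coset of a product depends only on the cosets of the factors can be verified exhaustively in a small case. The Python sketch below is an illustration under assumptions: G is taken to be the 3-group of odd permutations of four letters, and the sketch works directly with a subgroup $H_0$ of the even permutations that is invariant under G, here the four-group of double transpositions together with the identity. The names `coset` and `table` are hypothetical.

```python
from itertools import permutations, product

def compose(p, q):
    """Permutation 'first p, then q' on {0, 1, 2, 3}."""
    return tuple(q[p[x]] for x in range(4))

def is_odd(p):
    """Parity of a permutation via its number of inversions."""
    inversions = sum(1 for i in range(4) for j in range(i + 1, 4) if p[i] > p[j])
    return inversions % 2 == 1

# Assumed 3-group: the odd permutations of four letters (m = 3).
G = [p for p in permutations(range(4)) if is_odd(p)]

def c(x, y, z):
    """Ternary operation of the example: the product of three elements."""
    return compose(compose(x, y), z)

# Assumed invariant H0: identity and the three double transpositions.
H0 = [(0, 1, 2, 3), (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)]

def coset(s):
    """Right coset H0 s as a frozenset."""
    return frozenset(compose(h, s) for h in H0)

cosets = {coset(s) for s in G}

# The coset of c(x, y, z) must depend only on the cosets of x, y, z.
table = {}
well_defined = True
for x, y, z in product(G, repeat=3):
    key = (coset(x), coset(y), coset(z))
    val = coset(c(x, y, z))
    if table.setdefault(key, val) != val:
        well_defined = False

# Unique solvability of the induced operation in its last argument.
solvable = all(len({table[(a, b, x)] for x in cosets}) == len(cosets)
               for a in cosets for b in cosets)
print(len(G), len(cosets), well_defined, solvable)
```

In this example the twelve elements of G fall into three cosets of four elements each, and the induced ternary operation on the three cosets is well defined.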

As we shall see later, m-adic quotient groups arising from invariant
subgroups are very special kinds of polyadic groups. However, Dörnte has
emphasized that m-adic quotient groups can arise in more general fashion. In
our presentation, his argument reduces to the fact that the only use made of
the invariance of subgroup H under G was to prove the invariance of $H_0$
under G. We shall call a subgroup H of G whose associated 2-group $H_0$ is
invariant under G a semi-invariant subgroup of G. It follows that every
semi-invariant subgroup of an m-adic group leads to an m-adic quotient group.

This result can be made still more general. For we observed earlier that
any subgroup $H_0$ of the associated 2-group $G_0$ of G gives rise to expansions
in cosets. It therefore follows that every subgroup of the associated 2-group of
an m-adic group which is invariant under the m-adic group leads to an m-adic
quotient group. In the absence of a subgroup H of G we shall write this
quotient group $G/H_0$.

It is immediately seen that with $H_0$ thus invariant under G, the right
cosets of G as regards $H_0$ are identical with the left cosets. For $s^{-1} H_0 s = H_0$ yields
$H_0 s = s H_0$. Conversely, if the right cosets of G as regards $H_0$ are identical with
the left cosets, then, for each element s of G, $H_0 s = s H_0$, so that $H_0$ is invariant
under G. We thus see that the Dörnte concept of semi-invariance may be said
to be the necessary and sufficient condition that a subgroup of a polyadic
group give rise to a quotient group. Our extension, however, frees G from the
need of possessing a subgroup H corresponding to the $H_0$ invariant under G.

In recent literature the concept of homomorphism appears as essentially
equivalent to that of quotient group(25). By means of our coset theorem we
readily show the same to be true for m-groups(26). As the analysis is not too
immediate, we have refrained from explicitly using this concept except in the
last section where it is especially needed.

(25) See, for example, B. L. van der Waerden, Moderne Algebra, Berlin, 1930, vol. 1, §9.
(26) Dörnte's Theorem 8, §6, does the same for his more limited concept of m-adic quotient
group under the assumption that the homomorph has at least one “first order element.”

An m-group G with operation c may be said to be homomorphic to an
m-group $\bar{G}$ with operation $\bar{c}$ if there is a many-one correspondence between
the elements of G and of $\bar{G}$ such that whenever $s_1, s_2, \cdots, s_m$ of G respectively
correspond to $\bar{s}_1, \bar{s}_2, \cdots, \bar{s}_m$ of $\bar{G}$, $c(s_1 s_2 \cdots s_m)$ corresponds to
$\bar{c}(\bar{s}_1 \bar{s}_2 \cdots \bar{s}_m)$. We first show that such a homomorphism between G and $\bar{G}$
determines a homomorphism between their abstract containing groups G*
and $\bar{G}^*$. In fact, let i-ad r of G* be said to correspond to i-ad $\bar{r}$ of $\bar{G}^*$ if there
exist elements $s_1, s_2, \cdots, s_i$ of G, and corresponding elements $\bar{s}_1, \bar{s}_2, \cdots, \bar{s}_i$
of $\bar{G}$, such that $r = c^*(s_1 s_2 \cdots s_i)$, $\bar{r} = \bar{c}^*(\bar{s}_1 \bar{s}_2 \cdots \bar{s}_i)$. It is readily seen that
this sets up a correspondence between all the elements of G* and all the
elements of $\bar{G}^*$. Furthermore, this correspondence is many-one. For suppose
r of G* corresponds to $\bar{r}_1$ and $\bar{r}_2$ of $\bar{G}^*$. Then we must have $r = c^*(s_1 s_2 \cdots s_i)$,
$\bar{r}_1 = \bar{c}^*(\bar{s}_1 \bar{s}_2 \cdots \bar{s}_i)$, and, also, $r = c^*(s_1' s_2' \cdots s_i')$,
$\bar{r}_2 = \bar{c}^*(\bar{s}_1' \bar{s}_2' \cdots \bar{s}_i')$, with
$s_1, s_2, \cdots, s_i, s_1', s_2', \cdots, s_i'$ of G corresponding to
$\bar{s}_1, \bar{s}_2, \cdots, \bar{s}_i, \bar{s}_1', \bar{s}_2', \cdots, \bar{s}_i'$
respectively of $\bar{G}$. If then s of G corresponds to $\bar{s}$ of $\bar{G}$, the equation
$c(s_1 s_2 \cdots s_i s \cdots s) = c(s_1' s_2' \cdots s_i' s \cdots s)$, obtained from the two forms of r,
yields $\bar{c}(\bar{s}_1 \bar{s}_2 \cdots \bar{s}_i \bar{s} \cdots \bar{s}) = \bar{c}(\bar{s}_1' \bar{s}_2' \cdots \bar{s}_i' \bar{s} \cdots \bar{s})$
as a result of the homomorphism between G and $\bar{G}$. Hence $\bar{r}_1 = \bar{r}_2$. Finally,
if $r_1$ and $r_2$ of G* thus correspond to $\bar{r}_1$ and $\bar{r}_2$ of $\bar{G}^*$, $c^*(r_1 r_2)$ corresponds
to $\bar{c}^*(\bar{r}_1 \bar{r}_2)$: immediately, if $r_1$ and $r_2$ are an i-ad and j-ad respectively
with $i+j \le m-1$, and via the homomorphism between G and $\bar{G}$ if $i+j > m-1$.
The many-one correspondence between the elements of G* and of $\bar{G}^*$ is
therefore a homomorphism.

The ordinary theorem on homomorphisms is therefore applicable, and we
can state that the elements of G* corresponding to the identity of $\bar{G}^*$ constitute
an invariant subgroup $H_0$ of G*, while the elements of G* corresponding
to any element of $\bar{G}^*$ constitute a coset in the expansion of G* as regards $H_0$,
the quotient group $G^*/H_0$ being then simply isomorphic with $\bar{G}^*$. Since the
identity of $\bar{G}^*$ is an $(m-1)$-ad, $H_0$ must consist of $(m-1)$-ads in G*, and is
thus a subgroup of $G_0$ invariant under G. Those cosets of G* as regards $H_0$
which involve elements of G therefore constitute an expansion of G as regards
$H_0$. Finally, the correspondence between G* and $\bar{G}^*$ is but the original
correspondence for elements of G and $\bar{G}$. We thus have the following theorem.
If m-group G is homomorphic to m-group $\bar{G}$, there is an m-adic quotient group
$G/H_0$ such that the correspondents of each element of $\bar{G}$ constitute a coset in
$G/H_0$, this quotient group then being simply isomorphic with $\bar{G}$. Actually, as
we have seen, $H_0$ consists of the elements of $G_0$ corresponding to the identity
of $\bar{G}_0$ in the homomorphism between G* and $\bar{G}^*$, and hence between $G_0$ and $\bar{G}_0$
determined by the given homomorphism. Since an m-group G is clearly
homomorphic to any m-adic quotient group $G/H_0$, the equivalence of the concepts
of homomorphism and quotient group has been shown to hold also for
m-groups.
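
The theorem can be illustrated on a pair of small 3-groups. The Python sketch below rests on assumed data throughout: G is taken as the odd residues modulo 8 and $\bar{G}$ as the odd residues modulo 4, each under the sum of three elements, with reduction modulo 4 as the correspondence. The names `phi`, `fibres` and `cosets` are hypothetical. The sketch checks the homomorphism property, forms $G_0$ as the sums of $m-1 = 2$ elements of G, takes for $H_0$ those elements of $G_0$ that reduce to 0 modulo 4, and compares the sets of correspondents with the cosets of $H_0$ in G.

```python
from itertools import product
from collections import defaultdict

G = [1, 3, 5, 7]                  # assumed 3-group: odd residues mod 8
G_bar = [1, 3]                    # assumed 3-group: odd residues mod 4

def c(x, y, z):
    """Ternary operation of G in the example."""
    return (x + y + z) % 8

def c_bar(x, y, z):
    """Ternary operation of G_bar in the example."""
    return (x + y + z) % 4

def phi(x):
    """Assumed many-one correspondence from G to G_bar."""
    return x % 4

is_hom = all(phi(c(x, y, z)) == c_bar(phi(x), phi(y), phi(z))
             for x, y, z in product(G, repeat=3))

# G0: sums of m - 1 = 2 elements of G. H0: the elements of G0 that
# correspond to the identity (here 0) among the dyads of G_bar.
G0 = sorted({(x + y) % 8 for x, y in product(G, repeat=2)})
H0 = [r for r in G0 if r % 4 == 0]

fibres = defaultdict(set)
for x in G:
    fibres[phi(x)].add(x)

cosets = {frozenset((h + s) % 8 for h in H0) for s in G}
same = {frozenset(f) for f in fibres.values()} == cosets
print(is_hom, G0, H0, sorted(sorted(f) for f in fibres.values()), same)
```

In this example each element of $\bar{G}$ has exactly two correspondents in G, and these pairs are precisely the cosets of $H_0$ = {0, 4} in G.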

A homomorphism between m-groups G and $\bar{G}$ is thus always an (N, 1)
isomorphism with fixed N, N of course finite for finite m-groups. A more
immediate consequence of the given homomorphism is that it sets up a many-one
correspondence between the subgroups of G and the subgroups of $\bar{G}$, an
m-group being considered now as a subgroup of itself. In fact, given a
subgroup of G, the corresponding elements of $\bar{G}$ are readily seen to satisfy the
conditions for an m-group, and thus constitute the uniquely corresponding
subgroup of $\bar{G}$. On the other hand, given a subgroup of $\bar{G}$, the set of all
corresponding elements of G constitutes a subgroup of G with the given subgroup
of $\bar{G}$ as corresponding subgroup, and indeed, contains all such subgroups of G.
Clearly this many-one correspondence between the subgroups of G and of $\bar{G}$
is preserved under the relation “subgroup of,” subgroup being taken in the above
sense of group or subgroup.

It is also readily verified that if the set $\bar{G}$ is not known to be an m-group
under operation $\bar{c}$, yet the remainder of the definition of homomorphism
between G and $\bar{G}$ is satisfied, then $\bar{G}$ is an m-group under $\bar{c}$, and hence the
given relation a genuine homomorphism. In fact, the only part of our
definition of m-group not immediately given for $\bar{G}$ under $\bar{c}$, as a consequence
of its being satisfied by G under c, is the uniqueness of the solution of
$\bar{c}(\bar{s}_1 \bar{s}_2 \cdots \bar{s}_m) = \bar{s}_{m+1}$ for $\bar{s}_i$, $1 \le i \le m$. Passing by the considerations of the
footnote of §1 and a special argument valid only for G finite, we can in every
case solve corresponding equations $c(s_1 s_2 \cdots s_m) = s_{m+1}$ for $s_i$ as in §2, with
all s's except $s_i$ and $s_{m+1}$ fixed, and thus find that all such s's must correspond
to the same, consequently unique, $\bar{s}_i$(27).

Our converse of the coset theorem admits of immediate extension to the
case of an m-adic quotient group. For the statement of this result we need
the concept of order, when finite, of an element of an m-group as given in the
beginning of §21. We may note now, however, that an element $s$ may be said
to be of first order if $c(ss \cdots s) = s$, the unit class with sole member $s$ then
being a subgroup of the given m-group. We see then immediately that if an
element of an m-adic quotient group is of the first order, the corresponding coset
constitutes a subgroup of the given m-group. For the isomorphism between
the given m-group and the quotient group shows that if in an equation
$c(s_1 s_2 \cdots s_m) = s_{m+1}$ any $m$ elements are in the coset, the $(m+1)$-st element
must also be in that coset. Now consider any element $\sigma$ of finite order $k$ of
the quotient group. Anticipating a concept of the next section, we may note
now that our given m-group will constitute a polyadic group under the ex-
tended operation $c(s_1 s_2 \cdots s_\mu)$ with $\mu = k(m-1)+1$. Our m-adic quotient
group likewise extends to a $\mu$-group with the element $\sigma$ now being a first
order element of the $\mu$-adic quotient group. The previous result therefore
leads to the following. If an element of an m-adic quotient group is of finite
order $k$, then the elements of the corresponding coset constitute a polyadic group
under the operation of the given group extended to $k(m-1)+1$ elements.

(27) If a general isomorphism between m-groups $G$ and $\bar G$ be defined as a many-many cor-
respondence between their elements in which m-adic products of corresponding elements cor-
respond, then, for finite m-adic groups, as for finite ordinary groups, the correspondence is
that of a simple isomorphism between m-adic quotient groups of $G$ and $\bar G$. On the other hand,
Dickson (these Transactions, vol. 6 (1905), pp. 205-208) has shown by an example that the
finite group theorem does not hold for infinite groups, while Loewy (Festschrift Heinrich
Weber, 1912, pp. 198-227) calls an isomorphism “vollständig” if inverses of corresponding
elements also correspond (the case when the finite group theorem does hold for infinite groups)
and derives a number of interesting conditions for a general isomorphism to be “vollständig.”
In the case of infinite m-adic groups, the condition under which the finite m-adic group theorem
goes over can be written in a variety of ways, but perhaps most symmetrically as follows.
If in two equations $c(s_1 s_2 \cdots s_m) = s_{m+1}$, $\bar c(\bar s_1 \bar s_2 \cdots \bar s_m) = \bar s_{m+1}$, $m$ of the $m+1$ symbols in the
first equation, and the $m$ corresponding symbols in the second equation, represent elements
of $G$ and $\bar G$ respectively that correspond, then the elements represented by the remaining
symbols must correspond. The writer is indebted to Reinhold Baer for the above references (as
well as for the Neumann reference of §30).

5. Reducibility. Given any ordinary group with class of elements $C$ and
dyadic operation $s_1 s_2$, an m-adic group on the same elements will be deter-
mined if we set up the m-adic operation $c(s_1 s_2 \cdots s_m) = s_1 s_2 \cdots s_m$. We shall
call the m-group an extension of the 2-group, and say that it is reducible to
that 2-group. Note that while the coset theorem presented an arbitrary poly-
adic group in a somewhat similar light, the elements of the polyadic group
formed but a proper subclass of the class of elements of the 2-group; whereas,
when a polyadic group is reducible to a 2-group, the classes of elements are
identical.
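
As a rough computational illustration of the extension just described, the following Python sketch builds an m-adic operation from a binary one and brute-force checks two properties on a small finite set. The helper names `extend` and `is_m_group` are hypothetical, the additive group of integers modulo 4 is only an example, and the two checks (unique solvability in each position, and associativity) are assumed here to be the defining conditions of an m-group; the authoritative statement is the definition in §1.

```python
from itertools import product


def extend(binary_op, m):
    """Return the m-adic operation c(s1, ..., sm) = s1 * s2 * ... * sm (hypothetical helper)."""
    def c(*args):
        if len(args) != m:
            raise ValueError(f"expected {m} arguments, got {len(args)}")
        result = args[0]
        for s in args[1:]:
            result = binary_op(result, s)
        return result
    return c


def is_m_group(elements, c, m):
    """Brute-force check, on a finite set, of the two properties assumed to define an m-group."""
    elements = list(elements)
    universe = set(elements)
    # Unique solvability: with the other m - 1 arguments fixed, x -> c(..., x, ...) is a bijection.
    for i in range(m):
        for rest in product(elements, repeat=m - 1):
            outputs = {c(*rest[:i], x, *rest[i:]) for x in elements}
            if outputs != universe:
                return False
    # Associativity: the inner product may start at any position of a (2m - 1)-ad.
    for args in product(elements, repeat=2 * m - 1):
        reference = c(c(*args[:m]), *args[m:])
        for i in range(1, m):
            if c(*args[:i], c(*args[i:i + m]), *args[i + m:]) != reference:
                return False
    return True


# Example: the triadic extension of addition modulo 4.
c3 = extend(lambda a, b: (a + b) % 4, 3)
print(is_m_group(range(4), c3, 3))
```

The check is exponential in m and in the number of elements, so it is only meant for very small examples.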

More generally, given a $\mu$-group with class of elements $C$ and opera-
tion $c_\mu(s_1 s_2 \cdots s_\mu)$, if $m$ is any number in the form $k(\mu-1)+1$ we can
then form an m-adic group under the operation $c_m(s_1 s_2 \cdots s_m) = c_\mu(s_1 s_2 \cdots s_m)$.
As before, the m-group will be said to be an extension of the $\mu$-group, and re-
ducible to the $\mu$-group.

An m-adic operation on a finite number of elements is most naturally ex-
hibited by an m-dimensional table. We shall therefore say that an m-adic
group is of dimension $m$. We then see that while a 2-group has an extension
for each dimension $m > 2$, a $\mu$-group has an extension for those and only those
dimensions $m$ for which $m-1$ is a multiple of $\mu-1$.

A given m-group will be said to be reducible to a $\mu$-group if there exists a
$\mu$-group to which it is reducible. The m-group will be said to be irreducible if
it is not reducible to a $\mu$-group for any $\mu < m$(28). Dörnte has already given a
necessary and sufficient condition that a polyadic group be reducible to a
2-group. We proceed to generalize this result to reducibility to a $\mu$-group.

A $(\mu-1)$-ad $\{a_1, a_2, \ldots, a_{\mu-1}\}$ will be said to be commutative with an
element $a$ if the $\mu$-ads $\{a_1, a_2, \ldots, a_{\mu-1}, a\}$ and $\{a, a_1, a_2, \ldots, a_{\mu-1}\}$ are
equivalent. We then have the following basic theorem on reducibility. A nec-
essary and sufficient condition that a given m-group be reducible to a $\mu$-group,
$m = k(\mu-1)+1$, is that there be a $(\mu-1)$-ad $\{a_1, a_2, \ldots, a_{\mu-1}\}$ formed from ele-
ments of the m-group such that the $(\mu-1)$-ad is commutative with every element of
the m-group, and such that the $(m-1)$-ad
$\{a_1, a_2, \ldots, a_{\mu-1}, a_1, a_2, \ldots, a_{\mu-1}, \ldots, a_1, a_2, \ldots, a_{\mu-1}\}$ is an identity of the m-group.

The necessity of this condition follows immediately from the existence
and properties of identities. For, if the m-group is reducible to a $\mu$-group, let
$\{a_1, a_2, \ldots, a_{\mu-1}\}$ be an identity of such a $\mu$-group. If $c_\mu$ is the operation of the
$\mu$-group, $c_\mu(a_1 a_2 \cdots a_{\mu-1} s) = s = c_\mu(s a_1 a_2 \cdots a_{\mu-1})$ for every element $s$ of the
$\mu$-group. Hence $\{a_1, a_2, \ldots, a_{\mu-1}\}$ is commutative with every element of the
$\mu$-group, and hence, by the hypothesis of reducibility, with every element of
the m-group. Furthermore, the $(m-1)$-ad
$\{a_1, a_2, \ldots, a_{\mu-1}, a_1, a_2, \ldots, a_{\mu-1}, \ldots, a_1, a_2, \ldots, a_{\mu-1}\}$ is an extended identity of the $\mu$-group, and hence
an identity of the m-group, as was to be proved.

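(28) “Echt” in Dörnte. Otherwise, “unecht” or “ableitbar.”

As for the sufficiency of the condition, with $\{a_1, a_2, \ldots, a_{\mu-1}\}$ as in the
hypothesis, define the $\mu$-adic operation

$$c_\mu(s_1 s_2 \cdots s_\mu) = c_m(s_1 s_2 \cdots s_\mu\, a_1 a_2 \cdots a_{\mu-1} \cdots a_1 a_2 \cdots a_{\mu-1}),$$

with $k-1$ sequences $a_1 a_2 \cdots a_{\mu-1}$ on the right.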


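We proceed to prove that the elements of the m-group constitute a $\mu$-group
under the operation $c_\mu$, and that the given m-group is reducible to this
$\mu$-group. Of the two conditions defining a polyadic group, condition 1 is
satisfied by the proposed $\mu$-group as an immediate consequence of its being
satisfied by the given m-group. On the other hand, condition 2 for the $\mu$-group
becomes

$$c_m(c_m(s_1 \cdots s_\mu\, a_1 \cdots a_{\mu-1} \cdots a_1 \cdots a_{\mu-1})\, s_{\mu+1} \cdots s_{2\mu-1}\, a_1 \cdots a_{\mu-1} \cdots a_1 \cdots a_{\mu-1})$$
$$= c_m(s_1 \cdots s_i\, c_m(s_{i+1} \cdots s_{i+\mu}\, a_1 \cdots a_{\mu-1} \cdots a_1 \cdots a_{\mu-1})\, s_{i+\mu+1} \cdots s_{2\mu-1}\, a_1 \cdots a_{\mu-1} \cdots a_1 \cdots a_{\mu-1}),$$

which follows from condition 2 for the m-group, and the commutativity of
$\{a_1, a_2, \ldots, a_{\mu-1}\}$ with each element of the m-group. Hence the existence of
the $\mu$-group. Finally, using extended operations, and applying the commuta-
tivity part of our hypothesis, we will have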


$$c_\mu(s_1 s_2 \cdots s_m) = c_m(s_1 s_2 \cdots s_m\, a_1 a_2 \cdots a_{\mu-1} \cdots a_1 a_2 \cdots a_{\mu-1}) = c_m(s_1 s_2 \cdots s_m),$$

the second expression involving a sequence consisting of $k(k-1)$ sequences
$a_1 a_2 \cdots a_{\mu-1}$, which sequence, therefore, constitutes an extended identity of
the m-group, since by hypothesis $k$ such sequences constitute an identity.
Hence the reducibility of the m-group to the $\mu$-group follows.

From the definition of $c_\mu$, we see that the $(\mu-1)$-ad $\{a_1, a_2, \ldots, a_{\mu-1}\}$
is indeed an identity of the resulting $\mu$-group.

The above theorem may be used to prove a polyadic group irreducible, as
is shown by the following simple illustration. The class of integers constitutes
an infinite m-adic group under the operation $s_1 + s_2 + \cdots + s_m + 1$. Since the
group is abelian, reducibility to a $\mu$-group with $m = k(\mu-1)+1$ is equivalent
to the existence of an integer $a$ such that $ka + s + 1 = s$, that is, $ka = -1$,
which is impossible for any integral $k > 1$. Hence the m-group is irreducible.
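
The arithmetic in this illustration can be spot-checked numerically. The sketch below is only a finite sanity check, not a proof: the function names `c` and `candidate_sums` and the search window are assumptions made for the example. It looks for integers $a$ with $ka = -1$ and confirms that a polyad whose entries sum to $-1$ acts as an identity.

```python
def c(*s):
    """The m-adic operation s1 + s2 + ... + sm + 1 on the integers, with m = len(s)."""
    return sum(s) + 1


def candidate_sums(k, window=range(-1000, 1001)):
    """Integers a in a finite window with k*a + s + 1 == s for every s, i.e. k*a == -1."""
    return [a for a in window if k * a == -1]


# k = 1 corresponds to the m-group itself: any (m-1)-ad with sum -1 is an identity.
print(candidate_sums(1))
# For k > 1 no integer a satisfies k*a == -1, since k does not divide -1.
print(all((-1) % k != 0 for k in range(2, 200)))
# With m = 4, the 3-ad (-1, 0, 0) acts as a left identity on 7.
print(c(-1, 0, 0, 7))
```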

The commutativity condition can be restated to read $\{a_1, a_2, \ldots, a_{\mu-1}\}$
is invariant under the m-group. Since the present multiplicity of basic opera-
tions makes us refrain from employing the simplifications of the coset theo-
rem, the concept of invariance is preferable only for $\mu-1 = 1$. Our $(\mu-1)$-ad
is now a single element $a$; and the further condition that the $(m-1)$-ad
$\{a, a, \ldots, a\}$ be an identity of the m-group may be restated to read: $a$ is
of first order. For this condition is equivalent to $c_m(aa \cdots aa) = a$. We may
therefore state the special result, a rewording only of Dörnte’s, a necessary
and sufficient condition that a given m-group be reducible to an ordinary group
is that the m-group possess an invariant element of first order. Our succeeding
development will reveal many general classes of polyadic groups that can be
proved reducible to 2-groups. One such class is already at hand, that is, all
m-adic quotient groups arising from invariant subgroups of m-adic groups are
reducible to 2-groups. For the element of the quotient group corresponding to
the invariant subgroup is immediately seen to be invariant under the quotient
group, and of m-adic order one. In this connection we may observe that semi-
invariant subgroups also lead to special kinds of polyadic quotient groups,
for the element corresponding to that semi-invariant subgroup must again
be of first order. On the other hand, any polyadic group can be a quotient
group in our most general sense; for, with $H_0$ the identity of $G_0$, $G/H_0$ is
identical with $G$.

Given an m-adic group G, we may ask for the distribution of, and inter- 
relations between, the polyadic groups to which it is reducible. Note immedi- 
ately that if G is reducible to G’, and G’ to G’’, G is reducible to G’’, so that 
the class of groups to which G’ is reducible is a subclass of the class of groups 
to which G is reducible whenever G is reducible to G’. Our results are of two 
kinds, both derived from the above theorem. 

The first type of result is not much more than a restatement of the con-
dition of the theorem. We recall that, if G is reducible to G’, the class of ele-
ments of G is identical with the class of elements of G’, while the operation
of G is an extended operation of G’. It follows that a class of equivalent $i$-ads
of G is also a class of equivalent $i$-ads of G’, and conversely. In particular,
the class of identities of G’ is a class of equivalent polyads(29) of G, so that the
classes of identities of two groups to which G may be reducible are either the
same or mutually exclusive.

When the classes of identities are distinct, the two groups in question will
be distinct, as their operations cannot then be identical(30). On the other
hand, we easily see that when the classes of identities are the same, the groups
are identical. For, if their operations are $c'$ and $c''$, then, with $\{a_1, a_2, \ldots, a_{\mu-1}\}$
an identity of each, we have

$$c'(s_1 s_2 \cdots s_\mu) = c(s_1 s_2 \cdots s_\mu\, a_1 a_2 \cdots a_{\mu-1} \cdots a_1 a_2 \cdots a_{\mu-1}) = c''(s_1 s_2 \cdots s_\mu),$$

$c$ being an extended operation of each group. Observe finally that in the
sufficiency proof of our basic theorem, and in the succeeding observation,
if $\{a_1, a_2, \ldots, a_{\mu-1}\}$ satisfies the given condition of that theorem, each
$(\mu-1)$-ad equivalent to $\{a_1, a_2, \ldots, a_{\mu-1}\}$ also does. We therefore can state
the following result. There is a 1-1 correspondence between the groups to which
a given m-adic group is reducible and the classes of equivalent polyads satisfying
the condition of the basic theorem, each such class of equivalent polyads being the
class of identities of the corresponding group.

(29) By a class of equivalent polyads we mean a class of equivalent $i$-ads for some fixed $i$.
While the elements of $G^*$ as first written are classes of equivalent $i$-ads with $1 \le i \le m-1$, in
general no such restriction is intended by the above phrase. As suggested in §2, by the use of
extended operations the concept of equivalent $i$-ads becomes valid for $i > m-1$. This observa-
tion will be of greater importance later in the present section.

(30) They may however be “abstractly the same” in the sense of being simply isomorphic.
See the opening paragraph of §23.

In particular, there are as many 2-groups to which an m-adic group is re-
ducible as there are invariant elements of order one in the m-group(31). Thus,
consider an ordinary abelian group of finite order $g$. If $d$ is any divisor of $g$,
there are at least $d$ elements $a$ in this 2-group with $a^d = 1$. If this 2-group be
extended to a $(d+1)$-group, each such element $a$ is of order one in the $(d+1)$-
group, and invariant therein. The $(d+1)$-group is therefore reducible to at
least $d$ distinct 2-groups, each such $a$, in fact, being the identity of the corre-
sponding 2-group.
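
To make this example concrete, the sketch below takes the cyclic group of integers modulo $g$ under addition as the abelian 2-group (an assumption made for illustration; the text allows any finite abelian group) and lists the binary operations obtained from its $(d+1)$-adic extension. Each reduced operation is formed as in the sufficiency argument, by appending $d-1$ copies of the chosen element $a$. The names `reductions_to_2_groups` and `extended` are hypothetical.

```python
from functools import reduce


def reductions_to_2_groups(g, d):
    """Map each a in Z_g with d*a = 0 (mod g) to a binary operation having a as identity."""
    if d < 1 or g % d != 0:
        raise ValueError("d must be a positive divisor of g")

    def c(*s):
        # (d+1)-adic extension of addition modulo g
        if len(s) != d + 1:
            raise ValueError(f"expected {d + 1} arguments")
        return sum(s) % g

    ops = {}
    for a in range(g):
        if (d * a) % g == 0:  # the condition a^d = 1, written additively
            # s, t followed by d - 1 copies of a: d + 1 arguments in all
            ops[a] = lambda s, t, a=a: c(s, t, *([a] * (d - 1)))
    return ops


def extended(op, args):
    """Extend a binary operation to a longer argument list by repeated application."""
    return reduce(op, args)


ops = reductions_to_2_groups(12, 4)
print(sorted(ops))  # [0, 3, 6, 9]
```

For $g = 12$ and $d = 4$ this yields four distinct binary operations, with identities 0, 3, 6 and 9, each of which extends back to the same 5-adic sum modulo 12.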

Our second type of result concerns the possible dimensions of the groups
to which a given polyadic group is reducible. The complete result is an im-
mediate consequence of the following theorem. If an m-group is reducible to a
$\mu_1$-group and a $\mu_2$-group, it is reducible to a $\mu$-group where $\mu-1$ is the highest com-
mon factor of $\mu_1-1$ and $\mu_2-1$. To prove this theorem let $\{a_1', a_2', \ldots, a'_{\mu_1-1}\}$
and $\{a_1'', a_2'', \ldots, a''_{\mu_2-1}\}$ be identities of the $\mu_1$-group and $\mu_2$-group respec-
tively. They then satisfy the condition of our basic theorem. Furthermore,
all but one of the letters in each can be chosen arbitrarily.

If then $\mu_1 > \mu_2$, we may assume $a_1' = a_1'', \ldots, a'_{\mu_2-1} = a''_{\mu_2-1}$. Consider then
the sequence $\{a'_{\mu_2}, \ldots, a'_{\mu_1-1}\}$ which we shall write $\{a_1''', \ldots, a'''_{\mu_3-1}\}$ with
$\mu_3-1 = (\mu_1-1)-(\mu_2-1)$. Then all but one of the letters of this sequence are
arbitrary. Inductively, we thus obtain the sequence $\{a_1^{(\lambda)}, \ldots, a^{(\lambda)}_{\mu_\lambda-1}\}$ with
all but one letter arbitrary, from the sequence $\{a_1^{(\lambda-1)}, \ldots, a^{(\lambda-1)}_{\mu_{\lambda-1}-1}\}$ and the
smallest preceding sequence, easily seen to be unique. Clearly the process
terminates when and only when $\mu_{\lambda-1}$ is equal to the smallest preceding $\mu_\nu$.

Now in terms of the $\mu_\lambda-1$’s, this process is nothing more than the Euclid
algorithm for finding the highest common factor of $\mu_1-1$ and $\mu_2-1$, where
the process of division is replaced by the more primitive form of repeated
subtractions. Hence, the above process terminates, and the last sequence
found may be written $\{a_1, \ldots, a_{\mu-1}\}$, where $\mu-1$ is the highest common fac-
tor of $\mu_1-1$ and $\mu_2-1$. We now prove that such a $(\mu-1)$-ad satisfies the
condition of our basic theorem.
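
The bookkeeping of sequence lengths in this argument can be sketched as follows. The function below tracks only the lengths $\mu_\lambda - 1$, not the sequences of group elements themselves, and forms each new length as the difference between the latest length and the smallest preceding one. The name `subtractive_hcf_dimension` is hypothetical.

```python
def subtractive_hcf_dimension(mu1, mu2):
    """Return (mu, lengths) with mu - 1 the highest common factor of mu1 - 1 and mu2 - 1.

    `lengths` records the successive values mu_lambda - 1 produced by repeated subtraction.
    """
    latest, smallest = mu2 - 1, mu1 - 1
    if latest < 1 or smallest < 1:
        raise ValueError("dimensions must be at least 2")
    lengths = [mu1 - 1, mu2 - 1]
    # Stop when the latest length equals the smallest preceding length.
    while latest != smallest:
        latest, smallest = abs(latest - smallest), min(latest, smallest)
        lengths.append(latest)
    return latest + 1, lengths


print(subtractive_hcf_dimension(11, 5))  # (3, [10, 4, 6, 2, 2])
```

Starting from dimensions 11 and 5, the lengths run 10, 4, 6, 2, 2, so the common reduction has dimension 3.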

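(31) In the case of abelian triadic groups this reduces to a theorem of Lehmer’s.

First, the sequence $\{a_1''', \ldots, a'''_{\mu_3-1}\}$ is commutative with every element
of the given m-group. For we have $\{a_1', \ldots, a'_{\mu_1-1}\} = \{a_1'', \ldots, a''_{\mu_2-1}, a_1''', \ldots, a'''_{\mu_3-1}\}$,
and both $\{a_1', \ldots, a'_{\mu_1-1}\}$ and $\{a_1'', \ldots, a''_{\mu_2-1}\}$ are commutative with every element $s$.
The polyad $\{a_1'', \ldots, a''_{\mu_2-1}, a_1''', \ldots, a'''_{\mu_3-1}, s\}$ is therefore equivalent to
$\{s, a_1'', \ldots, a''_{\mu_2-1}, a_1''', \ldots, a'''_{\mu_3-1}\}$, and so to $\{a_1'', \ldots, a''_{\mu_2-1}, s, a_1''', \ldots, a'''_{\mu_3-1}\}$,
whence $\{a_1''', \ldots, a'''_{\mu_3-1}, s\}$ is equivalent to $\{s, a_1''', \ldots, a'''_{\mu_3-1}\}$.
Hence, by induction, each $\{a_1^{(\lambda)}, \ldots, a^{(\lambda)}_{\mu_\lambda-1}\}$ is commutative with every ele-
ment of the m-group, and so $\{a_1, \ldots, a_{\mu-1}\}$ also is thus commutative.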

As for the second part of the condition, clearly $m-1 = k(\mu-1)$ with in-
tegral $k$. As in the commutativity argument, and with the commutativity
property, we obtain from the extended identities consisting of $k$ sequences
$\{a_1', \ldots, a'_{\mu_1-1}\}$ and $k$ sequences $\{a_1'', \ldots, a''_{\mu_2-1}\}$, an extended identity
consisting of $k$ sequences $\{a_1''', \ldots, a'''_{\mu_3-1}\}$. By induction, $k$ sequences
$\{a_1^{(\lambda)}, \ldots, a^{(\lambda)}_{\mu_\lambda-1}\}$ constitute an extended identity for every $\lambda$, and hence
the same is true of $k$ sequences $\{a_1, \ldots, a_{\mu-1}\}$. But, since $k(\mu-1) = m-1$,
the last is indeed an identity of our given m-group. $\{a_1, \ldots, a_{\mu-1}\}$ there-
fore satisfies completely the condition of our basic theorem, whence the pres-
ent result.

It follows that if $\mu_0$ is the least dimension of the groups to which a given
m-group is reducible, all other dimensions $\mu$ of such groups must be such that
$\mu-1$ is a multiple of $\mu_0-1$. We shall call $\mu_0$ the real dimension of the m-group,
with, of course, $\mu_0 = m$ if the group is irreducible. Since every $\mu-1$ must also
be a divisor of $m-1$, we easily obtain the following solution of the problem
of the distribution of the dimensions of the groups to which a given polyadic
group is reducible. If a group of dimension $m$ has real dimension $\mu_0$, and we
write $m-1 = k_0(\mu_0-1)$, then the dimensions of the groups to which the m-group
is reducible are those and only those numbers $\mu$ for which $\mu-1 = k(\mu_0-1)$, $k$ a
proper divisor of $k_0$.
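
The enumeration in this result is easy to carry out mechanically. The sketch below assumes the real dimension $\mu_0$ of the m-group is already known (finding it requires the group itself, which is not modelled here) and lists the numbers $\mu$ with $\mu - 1 = k(\mu_0 - 1)$ for $k$ a proper divisor of $k_0$. The name `reducible_dimensions` is hypothetical.

```python
def reducible_dimensions(m, mu0):
    """Dimensions mu with mu - 1 = k*(mu0 - 1), k a proper divisor of k0 = (m - 1)/(mu0 - 1)."""
    if mu0 < 2 or m < mu0 or (m - 1) % (mu0 - 1) != 0:
        raise ValueError("mu0 - 1 must be a positive divisor of m - 1")
    k0 = (m - 1) // (mu0 - 1)
    # Proper divisors of k0: every divisor except k0 itself (k = 1 gives mu = mu0).
    return [k * (mu0 - 1) + 1 for k in range(1, k0) if k0 % k == 0]


print(reducible_dimensions(13, 3))  # [3, 5, 7]
print(reducible_dimensions(7, 7))   # [] (an irreducible group has mu0 = m)
```

For a 13-group of real dimension 3, $k_0 = 6$ and the proper divisors 1, 2, 3 give dimensions 3, 5 and 7.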

While this result justifies the term real dimension on the basis of a mere 
enumeration of distinct dimensions, other considerations show that an m- 
group in general, even if reducible, must still be considered an m-group. We 
have already given an example which shows that the same m-group may be 
reducible to different groups of the same dimension, and, indeed, of the real 
dimension of the m-group. We now further observe that an m-group may be 
reducible to an irreducible group of higher dimension than the real dimension 
of the m-group, that is, not every succession of reductions of a group need 
lead to the real dimension of the group. If we call the dimensions of the irre-
ducible groups to which a polyadic group is reducible the irreducible dimen-
sions of the given group, the real dimension of the group is only the smallest
of its irreducible dimensions.

In contrast with the class of groups to which an m-group is reducible, the
class of extensions of an m-group is of very simple structure, since it has one
and only one group of each dimension $\mu$ with $\mu-1$ a multiple of $m-1$, and
no others. Of course, the reason is that extension is the direct process, reduc-
tion indirect. We now combine these processes to yield the concept of derived
group.

Given an m-group G, a polyadic group G’ will be said to be derivable from G
if it can be obtained from G by a finite succession of extensions and reductions.
The class of all polyadic groups derivable from a given polyadic group will be
called a net of polyadic groups. From this definition we see that each group of
a net yields that net. Furthermore, all groups of a given net have the same
class of elements; only the operations differ.

The concept of a net of polyadic groups is considerably simplified by the
following result. Any group of a net can be obtained from any other by a single
extension followed by a single reduction. A single extension or a single reduction
can obviously be replaced by an extension followed by a reduction. Since two
successive extensions are equivalent to a single extension, two successive re-
ductions to a single reduction, our result will follow if we can show that a
reduction followed by an extension is equivalent to an extension followed by a
reduction. Let then $G'$ with operation $c'_{m'}$ be reducible to $G''$ with operation
$c''_{m''}$, and let $G''$ be extended to $G'''$ with operation $c'''_{m'''}$. With the above sub-
scripts designating dimensionality, we have $m'-1 = k'(m''-1)$,
$m'''-1 = k''(m''-1)$. Now $c'_{m'}$ and $c'''_{m'''}$ are both extensions of operation $c''_{m''}$. If then we
extend $c''_{m''}$ to an operation $c^{iv}_{m^{iv}}$ with $m^{iv}-1 = k'k''(m''-1)$, $c^{iv}_{m^{iv}}$ will be an
extension of both $c'_{m'}$ and $c'''_{m'''}$. The corresponding group $G^{iv}$ is then reduci-
ble to both $G'$ and $G'''$, whence our result.

Stated otherwise, given any two groups of a net there is a third group of the 
net reducible to each of the given groups. We could therefore redefine a net as 
the class of groups to which the extensions of a given group are reducible, 
though the conclusion that a net does not depend on the particular group in 
it chosen as the given group is then not immediate. 

The two types of results referred to in the case of the groups to which a
given group is reducible now easily lead to corresponding results for the net
of groups derivable from a given group. In this connection, a $(\mu-1)$-ad
$\{a_1, a_2, \ldots, a_{\mu-1}\}$ of an m-group will be said to be of finite order if some
polyad of the form $\{a_1, a_2, \ldots, a_{\mu-1}, a_1, a_2, \ldots, a_{\mu-1}, \ldots, a_1, a_2, \ldots, a_{\mu-1}\}$
is an extended identity of the m-group. We then easily prove the following.
There is a 1-1 correspondence between the groups of the net of groups derivable
from a given group and the classes of equivalent polyads of finite order which are
commutative with every element of the given group, each such class of equivalent
polyads then being the class of identities of the corresponding group(32). In fact,
the above redefinition of a net immediately yields a many-one correspondence
of the above type, which is then seen to be one-one due to any pair of groups
of a net being in the class of groups to which a third is reducible.

Actually, it is easily verified that each of the concepts: class of equivalent
polyads, commutative with every element, and even polyad of finite order,
is independent of the particular group of the net chosen as given group, so
that the above result can be restated in terms of the net alone. It is also easily
proved that for finite polyadic groups every polyad is of finite order, so that
in such cases the corresponding condition need not be explicitly stated. In
particular, there are as many 2-groups in the net as there are invariant ele-
ments of finite order, and hence, for finite polyadic groups, as many as there
are invariant elements.

(32) Here, as elsewhere, “group” unqualified means polyadic group.

We pause to prove explicitly that the transform of one element of a group 
of a net by another is independent of the particular group employed. This 
will be so if true of any pair of groups, one reducible to the other. Since the 
operation of one of these groups is an extended operation of the other, an 
identity of the first group is an extended identity of the second; hence an 
inverse of an element in the first, an extended inverse of that element in the 
second, whence the identical transforms. 

The second type of result is obtained still more easily. We shall call the
least dimension of the groups of a net their outer real dimension. The outer
real dimension of a group is then always less than or equal to its real dimen-
sion. Given an m-group $G$ of outer real dimension $\mu^0$, some third group $G'$
of the net will be reducible both to the m-group, and a group of dimension $\mu^0$.
The real dimension of $G'$ will therefore exactly equal $\mu^0$. As $G'$ is reducible to
$G$, we see that $m-1$ is a multiple of $\mu^0-1$. That is, if the outer real dimension
of an m-group is $\mu^0$, then $\mu^0-1$ must be a divisor of $m-1$.

Hence, also, all the groups of the net have dimensions $\mu$ with $\mu-1$ a
multiple of $\mu^0-1$. Since, from a group of dimension $\mu^0$, mere extensions yield
groups of all such dimensions, we have the following main result. If the outer
real dimension of the groups of a net is $\mu^0$, their dimensions are those and only
those numbers $\mu$ for which $\mu-1$ is a multiple of $\mu^0-1$.

The first type of result is easily restated to yield a criterion for determin- 
ing the outer real dimension of a group. In particular, the outer real dimension 
of a group is 2 when and only when it contains an invariant element of finite 
order. Thus, a finite abelian polyadic group is always of outer real dimension 
2, and so is derivable from a 2-group, while a group having no invariant ele- 
ment is always of outer real dimension greater than 2. The existence of the 
latter type of group is peculiar to polyadic theory. A simple example is fur- 
nished by the class of odd substitutions of the symmetric group of degree 
three. By the converse of the coset theorem they form a triadic group of 
order three under the product of three substitutions as operation, and yet in- 
volve no invariant element. The three elements, incidentally, are all of first 
order in the triadic group. 
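
The example of the odd substitutions of the symmetric group of degree three is small enough to verify by direct enumeration. In the sketch below, substitutions are represented as tuples acting on 0, 1, 2, and an element $a$ is tested for invariance by checking that the dyads $\{a, s\}$ and $\{s, a\}$ give the same triadic product with every third element; this encoding, and the names `compose`, `is_odd`, `ODD` and `c`, are assumptions made for the illustration.

```python
from itertools import permutations, product


def compose(p, q):
    """Product of two substitutions on 0, 1, 2: apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))


def is_odd(p):
    """True when the substitution has an odd number of inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 1


ODD = [p for p in permutations(range(3)) if is_odd(p)]


def c(r, s, t):
    """Triadic operation: the product of three substitutions."""
    return compose(compose(r, s), t)


# Closure: the product of three odd substitutions is odd.
closed = all(c(r, s, t) in ODD for r, s, t in product(ODD, repeat=3))
# Unique solvability in each of the three positions.
solvable = all(
    {c(*pair[:i], x, *pair[i:]) for x in ODD} == set(ODD)
    for i in range(3)
    for pair in product(ODD, repeat=2)
)
# Elements of first order: c(s, s, s) = s.
first_order = [s for s in ODD if c(s, s, s) == s]
# Invariant elements: {a, s} and {s, a} equivalent for every s.
invariant = [
    a for a in ODD
    if all(c(a, s, t) == c(s, a, t) for s, t in product(ODD, repeat=2))
]

print(len(ODD), closed, solvable, len(first_order), invariant)
```

Associativity of `c` is inherited from composition of substitutions, so it is not re-checked here.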

As in the case of mere reducibility, we shall call the dimensions of the 
irreducible groups of a net the outer irreducible dimensions of each group in 
the net. By contrast, a dimension will be said to be a reducible dimension of 
the groups of the net if there is at least one group of the net of that dimension, 
while all such groups are reducible. While we have no general theorem giving 
the distribution of these dimensions, the following special results lend a cer- 
tain insight into the possibilities involved. 




First, a group may have its real dimension as its only outer irreducible 
dimension. This is readily proved to be so for any 2-group which has no in- 
variant element other than the identity. In this case, in fact, the net of groups 
consists only of the 2-group, and its extensions. 

By contrast, a group may have an infinite number of outer irreducible
dimensions. Thus it can be shown that for the ordinary cyclic group of order
two the outer irreducible dimensions are the infinite set of numbers of the
form $2^n+1$, $n = 0, 1, 2, \ldots$.

Finally, it can be shown that every finite polyadic group has an infinite
number of reducible dimensions. To be specific, if an m-group has $g$ ele-
ments, there is, of course, at least one group of the net of dimension
$(kg+1)(m-1)+1$, for each $k = 1, 2, 3, \ldots$, and every group of the net of
such a dimension is reducible, reducible to dimension $m$, in fact.

We append a brief discussion of the generalization of the concept of a net 
of groups that arises from a consideration of the subgroups of a group. Let 
the complex of groups obtainable from a given polyadic group be the class of 
all polyadic groups obtainable from the given group by finite successions of 
the three operations “extension of,” “reduction of,” and “subgroup of.” It is 
readily verified by means of the very concepts involved that an extension of 
a subgroup of a group is also a subgroup of an extension of a group; and that 
a subgroup of a reduction of a group is also a reduction of a subgroup of the 
group. It follows that any group in a complex can be obtained from the given 
group by an operation of the single form “extension of” followed by “subgroup of” 
followed by “reduction of” if not merely by “extension of” followed by “reduc- 
tion of.” 

In the case of abelian groups we further have that a reduction of a sub- 
group of a group is also a subgroup of a reduction of the group, a result ob- 
tainable with the help of our criterion of reducibility. It follows that the 
complex of groups obtainable from an abelian polyadic group consists of the 
groups in the corresponding net of groups, and their subgroups. That this is 
not true for all complexes can be seen from the case of a group with a first 
order element, but no invariant element. For the first order element consti- 
tutes a subgroup of the given group reducible to a 2-group; while, the outer 
real dimension of the given group being greater than 2, the dimensions of all 
the groups in the net, and hence of their subgroups, is greater than 2. 

It is readily seen that the groups of a complex whose classes of elements 
are the same as that of the original group constitute the net of that group, 
or, as we shall now phrase it, the net of the complex. Clearly the net of a 
complex also consists of all of its groups from which that complex is obtain- 
able. On the other hand, a group of a complex with class of elements a proper 
subclass of that of the original group will yield a complex which is a proper 
subclass of the given complex, and may be called a subcomplex thereof. If we
call the nets of the subcomplexes of a complex the subnets of that complex,
then it is clear that the net and subnets of a complex constitute a separation
of the groups of the complex into mutually exclusive sets.

The relationship between the subcomplexes of a complex is in part fur-
nished by the following result. If of two groups in a complex the class of elements
of the first group is contained in the class of elements of the second, then the first
group is in the complex obtained from the second. For consider the two groups
to be obtained from an initial group according to our first result. Using $(c_m, C)$
to designate a group with m-adic operation $c_m$ and class of elements $C$, we
may indicate the process as follows:

$$(c_m, C) \to (c'_{m'}, C) \to (c'_{m'}, C') \to (c'_{\mu'}, C'),$$
$$(c_m, C) \to (c''_{m''}, C) \to (c''_{m''}, C'') \to (c''_{\mu''}, C'').$$

The two groups in the second column are also reductions of a third group
$(c^{iv}_{m^{iv}}, C)$. Since the third column symbolizes groups, it follows that $(c^{iv}_{m^{iv}}, C')$
and $(c^{iv}_{m^{iv}}, C'')$ are groups; and as $C'$ is contained in $C''$ by hypothesis,
$(c^{iv}_{m^{iv}}, C')$ is a subgroup of $(c^{iv}_{m^{iv}}, C'')$, if not identical with it. Now $(c^{iv}_{m^{iv}}, C')$,
$(c'_{m'}, C')$ and $(c'_{\mu'}, C')$ are in a single net of groups, as are also $(c^{iv}_{m^{iv}}, C'')$,
$(c''_{m''}, C'')$ and $(c''_{\mu''}, C'')$. Hence $(c'_{\mu'}, C')$ is in the complex obtainable from
$(c''_{\mu''}, C'')$, as was to be proved.

A particular application of the above result is the following. Any two 
groups of a complex which have the same class of elements are derivable from 
each other, that is, belong to one and the same net. It follows that there is a 1-1 
correspondence between the subnets, including the net, into which the groups 
of a complex were separated, and the different classes of elements of the 
groups in the complex. 

Hence also, or directly from our general result, there is a 1-1 correspond-
ence between the subcomplexes, including the complex, of a complex, and the
different classes of elements of the groups in the complex, each complex being
obtainable from those and only those groups whose classes of elements are
identical with the class of elements corresponding to the complex. Moreover,
our general result shows that one subcomplex contains a second when and
only when the class of elements corresponding to the first contains the class
of elements corresponding to the second. We now complete this picture by
proving the following. If two subcomplexes $K'$ and $K''$ of a complex correspond
to the classes of elements $C'$ and $C''$, then the logical product of $K'$ and $K''$, null
when the logical product of $C'$ and $C''$ is null, is otherwise a complex, namely
the complex corresponding to the logical product of $C'$ and $C''$. For $C'$ and $C''$
must be the classes of elements of two groups $(c'_{\mu'}, C')$ and $(c''_{\mu''}, C'')$ of the
complex. In the notation of the previous proof, $(c^{iv}_{m^{iv}}, C')$ and $(c^{iv}_{m^{iv}}, C'')$ are
then groups of the complex. If then $C'''$, the logical product of $C'$ and $C''$,
is not null, $(c^{iv}_{m^{iv}}, C''')$ is a group of the complex. The case $C'''$ null is immedi-
ate. Otherwise, then, there will be a subcomplex $K'''$ corresponding to $C'''$.
Our earlier result then shows immediately that a group $G$ is common to $K'$
and $K''$ when and only when it is in $K'''$.

Further results on the subcomplexes of a complex obtained from a finite
polyadic group, and more particularly a finite abelian polyadic group, will be
found at the end of §22, our second section on cyclic polyadic groups(33).

6. Arbitrary containing ordinary groups. The coset theorem led to the ab- 
stract containing ordinary group G* of an m-group G merely by a considera- 
tion of G treated abstractly. Often, however, the elements of G may immedi- 
ately be given in such a form that the m-adic operation is but an extension 
of a more primitive dyadic operation, as when G is an m-adic group of ordi- 
nary substitutions. In such a case a containing 2-group arises directly, and 
may be more useful than the abstract containing group. 

A 2-group G*’ will be called a containing group of an m-group G if the ele- 
ments of G are among the elements of G*’, the operation of G an extension of 
the operation of G*’, while G*’ is generated by the elements of G. In what 
follows we simultaneously investigate the possible structure of G*’, and its 
relationship to G*. We must therefore explicitly distinguish between their
operations c*’ and c* respectively(34).

Let two polyads $\{s_1, s_2, \ldots, s_i\}$ and $\{s_1', s_2', \ldots, s'_{i'}\}$ of $G$ lead to iden-
tical products in $G^*$; that is, let $c^*(s_1 s_2 \cdots s_i) = c^*(s_1' s_2' \cdots s'_{i'})$. Since $i-i'$
must then be a multiple of $m-1$, we can annex elements $s_1'', \ldots, s_j''$
of $G$, if need be, so that the resulting equation $c^*(s_1 s_2 \cdots s_i s_1'' \cdots s_j'')$
$= c^*(s_1' s_2' \cdots s'_{i'} s_1'' \cdots s_j'')$ can be rewritten $c(s_1 s_2 \cdots s_i s_1'' \cdots s_j'')$
$= c(s_1' s_2' \cdots s'_{i'} s_1'' \cdots s_j'')$ in, perhaps, extended notation. But this equation
can now be written $c^{*\prime}(s_1 s_2 \cdots s_i s_1'' \cdots s_j'') = c^{*\prime}(s_1' s_2' \cdots s'_{i'} s_1'' \cdots s_j'')$,
whence we obtain $c^{*\prime}(s_1 s_2 \cdots s_i) = c^{*\prime}(s_1' s_2' \cdots s'_{i'})$. That is, if two polyads
of $G$ lead to identical products in $G^*$ they lead to identical products in $G^{*\prime}$.
If then we let every element of the form $c^{*\prime}(s_1 s_2 \cdots s_i)$ in $G^{*\prime}$ correspond to
element $c^*(s_1 s_2 \cdots s_i)$ of $G^*$, a one-many correspondence is set up between
those elements of $G^{*\prime}$ and of $G^*$ which are obtainable as products of elements
of $G$.

This correspondence is clearly preserved under the respective operations
of these groups. For if $r_1$ and $r_2$ of G* correspond to $r_1'$ and $r_2'$ respectively


(83) The development of the section just ended, lengthy as it is, is probably but one of
many possible developments leading to sets of related polyadic groups. Dörnte’s Theorem 7,
§2, can probably be made the starting point for such a different development. The possibilities 
are further widened if a theory is contemplated which would include the relationship between 
a polyadic group and the corresponding “schar.” 

(84) It might be thought that now, when the ordinary group demanded by Miller’s theorem
is immediately given, at least the structure of G*’ requires no further investigation. But, apart
from the fact that Miller’s theorem is given for finite groups, his hypothesis that for some
integer n the product of any n but no fewer elements of G is in G is not immediately given, but
is replaced by G’s being an m-group. As we also need the relationship between G* and G*’, we 
make our development entirely independent of Miller’s. 




1940] POLYADIC GROUPS 239 


of G*’, by writing these elements as corresponding products of elements in G
we see immediately that $c^*(r_1 r_2)$ corresponds to $c^{*\prime}(r_1' r_2')$. Since G* consists
of the products of elements in G, it easily follows that the products in G*’ 
of elements of G themselves constitute a group which can then be none other 
than G*’; for G*’ is generated by G. Furthermore our one-many correspond- 
ence, which is therefore a correspondence between all the elements of G*’ 
and of G*, is indeed a one-many isomorphism between G*’ and G*. 

For fixed $i$ we shall call the set of elements of G*’ which are the products
of $i$ elements of G the $i$th coset of G*’. For these elements the above set of
equations can be reversed so that our one-many correspondence between G*’
and G* becomes a 1-1 correspondence between the elements of the $i$th cosets
of G*’ and of G* for each $i \ge 1$. From the corresponding result for G*, it follows
that the elements of the $i$th coset of G*’ will be obtained in 1-1 fashion if in
the expression $c^{*\prime}(s_1 \cdots s_{i-1} s)$ we let $s_1, \ldots, s_{i-1}$ be arbitrary fixed
elements of G, and let $s$ run through G.

Let now $k$ designate the least $i$ for which the corresponding coset of G*’
contains the identity $I'$ of G*’. It follows, first, that the first $k$ cosets of G*’
are mutually exclusive. For if we could have
$c^{*\prime}(s_1 \cdots s_i) = c^{*\prime}(s_1' \cdots s_j')$ with $1 \le i < j \le k$, then, by
rewriting $c^{*\prime}(s_1' \cdots s_j')$ in the form $c^{*\prime}(s_1 \cdots s_i s_{i+1}' \cdots s_j')$,
we would have $c^{*\prime}(s_{i+1}' \cdots s_j') = I'$, in contradiction to our definition of $k$.
On the other hand, the $(k+1)$-st coset of G*’ is identical with the first, that
is, with G, for we can write its elements in the form $c^{*\prime}(s_1 \cdots s_k s)$ with
$c^{*\prime}(s_1 \cdots s_k) = I'$. Hence also the $(k+2)$-nd coset is identical with the 2d, and
so on. G*’ therefore consists of the elements of its first $k$ cosets, while succeeding
cosets are cyclic repetitions of these. In particular, the $(m-1)$-st coset
must be identical with the $k$th coset. For if $\{s_1, s_2, \ldots, s_{m-1}\}$ is an identity
of G, $c^{*\prime}(s_1 s_2 \cdots s_{m-1}) = I'$, so that the $(m-1)$-st and $k$th cosets have an
element in common. Hence $k$ is a divisor of $m-1$.
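
The coset structure just described can be checked on a small case. The sketch below is only an illustration under stated assumptions: the containing 2-group is taken to be the integers mod 12 under addition, and G is taken to be the coset {1, 5, 9} of the subgroup {0, 4, 8}, which is closed under sums of 5 (or 9) elements. Neither this example nor the helper names `ith_coset` and `least_identity_length` come from the text.

```python
from itertools import product

N = 12                        # assumed containing 2-group: integers mod 12 under addition
G = frozenset({1, 5, 9})      # assumed m-group (m = 5 or 9): the coset 1 + {0, 4, 8}

def ith_coset(i):
    """Set of all products (here sums mod N) of i elements of G."""
    return frozenset(sum(word) % N for word in product(G, repeat=i))

def least_identity_length():
    """Least i for which the ith coset contains the identity 0."""
    i = 1
    while 0 not in ith_coset(i):
        i += 1
    return i

k = least_identity_length()
first_k = [ith_coset(i) for i in range(1, k + 1)]

print("k =", k)
print("first k cosets:", [sorted(coset) for coset in first_k])
print("(k+1)-st coset equals G:", ith_coset(k + 1) == G)
```

In this example the first k cosets are disjoint and exhaust the containing group, and k divides m-1 for both admissible dimensions m = 5 and m = 9.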

Returning to our correspondence between the elements of G*’ and of G* 
we see that it is 1-1 between the elements of G*’ and the elements of the 
first k cosets of G*, and of each succeeding set of k cosets of G*. Our one- 
many correspondence is thus actually [1, (m—1)/k], and we therefore have 
a [1, (m—1)/k] isomorphism between G*’ and G*. To complete our analysis 
we consider the analogue in G*’ of the associated 2-group $G_0$ of G in G*.

Our $[1, (m-1)/k]$ correspondence is clearly 1-1 between the elements of
the $k$th coset of G*’, and of $G_0$, the $(m-1)$-st coset of G*. Since the product of
two elements of the $k$th coset of G*’ is in the $2k$th coset, and hence also in
the $k$th coset, of G*’, the previous $[1, (m-1)/k]$ isomorphism between G*’
and G* is simple between the $k$th coset of G*’, and $G_0$. It follows that the $k$th
coset of G*’ constitutes a group with operation $c^{*\prime}$ simply isomorphic with $G_0$.
We shall call it the associated ordinary group of G in G*’, and symbolize
it $G_0'$. The same argument used in proving $G_0$ invariant under G* shows $G_0'$
to be invariant under G*’.




240 E. L. POST [September 


Since the $i$th coset of G*’ is given by $c^{*\prime}(s_1 \cdots s_{i-1} s)$, with
$s_1, \ldots, s_{i-1}$ fixed elements of G, $s$ running through G, we can let
$s_1, \ldots, s_{i-1}$ be the same element $s_0$ of G, and write that $i$th coset
$s_0^{i-1}G$ in ordinary notation. It can likewise be written $Gs_0^{i-1}$. We thus
obtain the expansion $G^{*\prime} = G + Gs_0 + Gs_0^2 + \cdots + Gs_0^{k-1}$. Since
$Gs_0^{k-1} = G_0'$, and $Gs_0^k = G$, we therefore have

$$G = G_0' s_0,$$

while the above expansion becomes

$$G^{*\prime} = G_0' + G_0' s_0 + G_0' s_0^2 + \cdots + G_0' s_0^{k-1}.$$

But this is the expansion of G*’ in augmented cosets as regards the invariant
subgroup $G_0'$, assuming $G_0'$ is not itself G*’. It follows that the quotient group
G*’/$G_0'$ is of index k, while the element in that quotient group corresponding
to G generates G*’/$G_0'$.

This concludes our discussion of the structure of G*’. As for its iso- 
morphism with G*, observe first that in that isomorphism elements of G 
correspond to themselves. We then see that the isomorphism between G*’ and 
G* is determined by this partial correspondence provided k, and the element 
of the kth coset of G*’ which serves as the identity of G*’, are specified. For 
the correspondence between elements of G and themselves determines the 1-1 
correspondence between the elements of the ith cosets of G*’ and of G* for 
every $i$. And given $k$, and $c^{*\prime}(s_1^0 s_2^0 \cdots s_k^0) = I'$, the $s^0$’s in G, if
$j = \kappa k + l$, $1 \le l \le k$, the equation
$c^{*\prime}(s_1^0 \cdots s_k^0 \cdots s_1^0 \cdots s_k^0 \, s_1 s_2 \cdots s_l) = c^{*\prime}(s_1 s_2 \cdots s_l)$,
with the $k$-ad $s_1^0 \cdots s_k^0$ written $\kappa$ times, serves to
identify each symbolized element of the $j$th coset of G*’ with a unique element
of the $l$th coset, and thus completes the correspondence between the
elements of G*’ and G*. In particular, the simple isomorphism between $G_0'$
and $G_0$ is also thus determined. We therefore have the following comprehen-
sive theorem: 

Every containing 2-group G*’ of an m-group G, if not itself a 2-group $G_0'$
to which G is reducible, contains an invariant subgroup $G_0'$ of index $k$, with $k$ a
divisor of $m-1$, G a coset of G*’ as regards $G_0'$, and the quotient group G*’/$G_0'$
generated by the element corresponding to G. Furthermore, G*’ admits a
$[1, (m-1)/k]$ isomorphism with G*, the abstract containing 2-group of G,
which reduces to a simple isomorphism between $G_0'$ and $G_0$, the associated 2-group
of G. This isomorphism makes each element of G correspond to itself, and is, in
fact, determined by this correspondence when $k$, which is the smallest $i$ for which
an $i$-ad of G yields the identity of G*’, as well as the class of equivalent $k$-ads of G
thus yielding the identity of G*’, are specified.

We shall call k the index of the containing 2-group. We have then, in par- 
ticular, that any two containing groups of index m—1 of an m-group are simply 
isomorphic, the isomorphism in question making each element of the m-group
correspond to itself, and being in turn determined by this correspondence. Hence, 




any containing group of index m—1 of an m-group G may be considered to be 
the abstract containing group G* of G. 

We further have that any two containing groups of index 1 of an m-group
are simply isomorphic. For the G*’’s are then also the $G_0'$’s which are both
simply isomorphic with $G_0$. Observe, however, that the simple isomorphism
now no longer makes elements of G correspond to themselves, or the G*’’s
would be identical. In fact, a different element of G serves as identity in
each G*’. Since G is now reducible to G*’, and conversely, we have as a corollary
the following result on the 2-groups to which an m-group is reducible,
and hence also on the 2-groups in a net. All 2-groups in a net of groups are
simply isomorphic.

Before considering the same question for two containing groups of index $k$,
$1 < k < m-1$, we ask when an m-group will admit a containing 2-group of
index $k$. We then easily obtain the following theorem. A necessary and sufficient
condition that an m-group admit a containing group of index $k$, $k < m-1$,
is that the m-group be reducible to a $(k+1)$-group. In fact, the observation
that in a containing group G*’ of index $k$ the products of $k+1$ elements of G
must be in G is easily extended to show that the elements of G constitute a
$(k+1)$-group under the operation $c^{*\prime}(s_1 s_2 \cdots s_{k+1})$. As $k$ is a divisor of $m-1$,
the operation $c(s_1 s_2 \cdots s_m) = c^{*\prime}(s_1 s_2 \cdots s_m)$ is an extension of
$c^{*\prime}(s_1 s_2 \cdots s_{k+1})$,
and, consequently, G is reducible to the corresponding $(k+1)$-group. Con-
versely, if G is reducible to a (k+1)-group, the abstract containing group of 
the (k+1)-group is of index k. But this group is clearly also a containing 
group of G, and of index k. In particular, an irreducible m-group admits con- 
taining groups of index m—1 only, and conversely. Hence, the abstract contain- 
ing group of an irreducible polyadic group may be said to be its only contain- 
ing group. 
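
As a small check of this criterion, the sketch below takes G to be the three transpositions on three letters, regarded as a 5-group under the product of five substitutions, with the ordinary symmetric group as containing group. The choice of example and the helper names (`compose`, `is_odd`, `index_of_containing_group`) are assumptions made for illustration, not constructions from the text.

```python
from functools import reduce
from itertools import permutations, product

def compose(p, q):
    """Product pq in the transformation convention: perform p, then q."""
    return tuple(q[p[x]] for x in range(len(p)))

def is_odd(p):
    inversions = sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))
    return inversions % 2 == 1

IDENTITY = (0, 1, 2)
G = [p for p in permutations(range(3)) if is_odd(p)]   # the three transpositions

def c(*elements):
    """Ordinary product of any number of elements in the containing group."""
    return reduce(compose, elements)

def index_of_containing_group(max_len):
    """Least i for which some i-ad of G yields the identity."""
    for i in range(1, max_len + 1):
        if any(c(*word) == IDENTITY for word in product(G, repeat=i)):
            return i
    raise ValueError("no i-ad up to max_len yields the identity")

m = 5
k = index_of_containing_group(m - 1)

# G is closed under products of k+1 elements, and the 5-ary operation is the
# extension of that (k+1)-ary operation.
closed = all(c(*w) in G for w in product(G, repeat=k + 1))
extends = all(c(*w) == c(c(*w[:k + 1]), *w[k + 1:]) for w in product(G, repeat=m))

print("index k =", k, "| closed under (k+1)-products:", closed, "| extension:", extends)
```

Here the index found is smaller than m-1, and correspondingly the 5-group is seen to be reducible to a 3-group on the same elements.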

This relation to reducibility shows that there are as many essentially different
containing groups of index $k < m-1$ of an m-group G as there are
$(k+1)$-groups to which G is reducible. Hence when $1 < k < m-1$, as when
$k = 1$, two essentially different containing groups of index $k$ will not admit a
simple isomorphism which makes each element of G correspond to itself, since
the classes of equivalent $k$-ads yielding their identities will be different. Moreover,
unlike the case $k = 1$, they need not even admit a simple isomorphism
which transforms the class of elements of G into itself. For our example of a
group having an infinite number of outer irreducible dimensions easily leads
to a group G reducible to two groups $G_1$ and $G_2$ of the same dimension, one
reducible, the other irreducible. The abstract containing groups of $G_1$ and $G_2$
are containing groups of G of the same index; and did they admit a simple
isomorphism of the type in question, $G_1$ and $G_2$ would be simply isomorphic,
and hence could not be one reducible, the other irreducible.

Finally, a word about the application of arbitrary containing groups of an 
m-group to the study of the m-group. With the containing group G*’ specified, 




we may use ordinary notation for its operation, and write the operation of the
m-group G, $s_1 s_2 \cdots s_m$. If $s$ is an element of G, and $\{s', s'', \ldots, s^{(m-2)}\}$ an
inverse of $s$, we shall still have $s' s'' \cdots s^{(m-2)} = s^{-1}$. It follows that the transform
of an element $s_1$ of G by an element $s_2$ will be given by $s_2^{-1} s_1 s_2$ no matter
what the containing group. Likewise, if $s' s'' \cdots s^{(i)} = r$ in G*’, the transform
of element $s$ of G by the $i$-ad $\{s', s'', \ldots, s^{(i)}\}$ will be given by $r^{-1} s r$. On
the other hand, the fact that, for an element $s$ of G, $s^{-1}$, in a containing group
of index $k < m-1$, can also be written as a product of $k-1$ elements of G,
or as an element of G when $k = 1$, gives but spurious information about the
corresponding $(k-1)$-ad, or element, of G.

7. Determination of all types of semi-abelianisms. In the notation of the
abstract containing group of an m-group G we may write the condition that G
be abelian in the form $s_1 s_2 = s_2 s_1$ for all elements $s_1$, $s_2$ of G. Dörnte discovered
that an m-group, $m > 2$, may satisfy the weaker type of commutativity property
$s_1 s_2 \cdots s_{m-1} s_m = s_m s_2 \cdots s_{m-1} s_1$ without necessarily being abelian, and
termed such groups semi-abelian. An immediate generalization of the
Dörnte type of semi-abelianism is that given by any relation

$$s_1 s_2 \cdots s_{\mu-1} s_\mu = s_\mu s_2 \cdots s_{\mu-1} s_1,$$

where $\mu-1$ is a divisor of $m-1$, s’s arbitrary elements of G. We shall then
say that G is $\mu$-semi-abelian. At least a trivial example of an m-group that is
$\mu$-semi-abelian, but not abelian, would be given by the extension to an m-group
of a non-abelian $\mu$-group semi-abelian in Dörnte’s sense.

In general we shall say that an m-group G is semi-abelian according to the
corresponding formal type if for all choices of s’s as elements in G a set of
relations of the following form are satisfied:

$$s_1 s_2 \cdots s_l = s_{i_1} s_{i_2} \cdots s_{i_l},$$

each right-hand member being a specific permutation of the left, not all the
permutations being the identity. Given m, two such formulations will then
be said to be equivalent, or to define the same type of semi-abelianism if all
m-groups which are semi-abelian according to one formal type are also semi-abelian
according to the other. We proceed to prove that the above extensions
of the Dörnte type of semi-abelianism constitute all possible semi-abelianisms.
More specifically, by the displacement of a letter in a given equation
of a set of the above type we shall mean the number of places, right or left,
it has to be moved in passing from the left side of the equation to the right.
We then prove that every formal type of semi-abelianism, for given dimension m,
is equivalent to $\mu$-semi-abelianism with $\mu-1$ the highest common factor of $m-1$



and all the displacements of the letters in the equations defining the semi-abelian- 
ism. 

Observe immediately that for $m = 2$ there is no semi-abelianism distinct
from abelianism. For in some equation a pair of letters $s_i$, $s_j$ will appear in
different orders on opposite sides of the equation; and by replacing all other
letters by the identity we obtain the condition for abelianism $s_i s_j = s_j s_i$. This
serves to make plausible our general result, and to give a hint of its proof.

In the general case, then, let G be any m-group semi-abelian according to
a given formal type, and let some letter $s_i$ have a nonzero displacement $k$ in
one of the equations defining that semi-abelianism. Since re-symbolization allows
either member of the equation to be written first, we may write the equation


so that 


The first bracket is equivalent to some $k$-ad $s' s'' \cdots s^{(k)}$. Since at least one
letter inside that bracket and outside the parenthesis must be different from
all the letters in the parenthesis, that $k$-ad, and hence $s', s'', \ldots, s^{(k)}$, can
be arbitrary. The second bracket is equivalent to some $\kappa$-ad $\bar{s}' \bar{s}'' \cdots \bar{s}^{(\kappa)}$.
We can always assume $\kappa > 1$, by introducing an identity if need be, and hence
at least $\bar{s}^{(\kappa)}$ is arbitrary. That is, for every $s', s'', \ldots, s^{(k)}, \bar{s}^{(\kappa)}$, we can find
$\bar{s}', \ldots, \bar{s}^{(\kappa-1)}$ so that


for every s;. Letting s;=s’, we find that s’’- - - 
whence 


Letting s;=5), we find s’s’’ - - - =1, whence 


It follows that for every $s', s'', \ldots, s^{(k)}, \bar{s}$ in G,

$$s' s'' \cdots s^{(k)} \bar{s} = \bar{s} s'' \cdots s^{(k)} s'.$$

Dropping momentarily the condition $\mu-1$ a divisor of $m-1$ in our definition
of $\mu$-semi-abelianism, we have therefore proved that for each displacement
$k > 0$, G is $(k+1)$-semi-abelian.

Let now G be $(k_1+1)$-semi-abelian and $(k_2+1)$-semi-abelian. We then
prove that G is $(k+1)$-semi-abelian with $k = \text{H.C.F.}(k_1, k_2)$. This will follow
if for every such $k_1$ and $k_2$ with $k_2 > k_1$, G is $(k_3+1)$-semi-abelian with
$k_3 = k_2 - k_1$. But under our hypothesis, with all other letters unmoved, we have




Finally, we show that if the m-group G is $(k+1)$-semi-abelian, it is also
$(k'+1)$-semi-abelian with $k' = \text{H.C.F.}(k, m-1)$. Since G is $(k+1)$-semi-abelian,
it is also $(\kappa k+1)$-semi-abelian for every positive integral $\kappa$. It is
therefore also $(k''+1)$-semi-abelian with $k''$ any positive integer in the form
$\kappa k - \lambda(m-1)$. For in the equation defining the $(\kappa k+1)$-semi-abelianism there
are at least $\lambda(m-1)$ letters between the first and last letters of each member;
and by choosing $\lambda(m-1)$ of these letters consecutively to form an extended
identity the desired $(k''+1)$-semi-abelianism is revealed. As positive integers
$\kappa$ and $\lambda$ can always be chosen so that $\kappa k - \lambda(m-1) = \text{H.C.F.}(k, m-1)$, our
result follows.
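
The last step rests on the existence of positive integers $\kappa$ and $\lambda$ with $\kappa k - \lambda(m-1) = \text{H.C.F.}(k, m-1)$. A minimal sketch of one way to exhibit such a pair follows; the function name `positive_bezout` and the plain search over $\kappa$ are illustrative choices, not part of the original argument.

```python
from math import gcd

def positive_bezout(k, m):
    """Return positive integers (kappa, lam) with kappa*k - lam*(m-1) == gcd(k, m-1).

    Assumes k >= 1 and m >= 2. The search terminates because the congruence
    kappa*k = gcd(k, m-1) (mod m-1) has solutions in every block of m-1
    consecutive values of kappa.
    """
    d = gcd(k, m - 1)
    kappa = 1
    while True:
        rest = kappa * k - d
        if rest > 0 and rest % (m - 1) == 0:
            return kappa, rest // (m - 1)
        kappa += 1

kappa, lam = positive_bezout(6, 10)      # displacement k = 6 in a 10-group
print(kappa, lam, kappa * 6 - lam * 9)
```

For k = 6 and m = 10 the pair found shows that a 7-semi-abelian 10-group is also 4-semi-abelian, since H.C.F.(6, 9) = 3.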

From these three special results it follows that every m-group possessing a
given formal type of semi-abelianism is $\mu$-semi-abelian with $\mu$ as in the statement
of our theorem. It remains to be shown that every m-group that is
$\mu$-semi-abelian also satisfies the given formal semi-abelianism. For each of the
given equations separates the letters in the left side of the equation into $\mu-1$
mutually exclusive sets such that each set consists of all letters whose “distance”
from a given letter is a multiple of $\mu-1$. Since in passing from the left
side to the right side of the equation each letter suffers a displacement itself
a multiple of $\mu-1$, the result is to permute the letters of each set among
themselves. Now a single application of our hypothesis of $\mu$-semi-abelianism
to the left side of the equation in question constitutes a transposition of two
letters in the same set. As $\mu$-semi-abelianism implies $[\kappa(\mu-1)+1]$-semi-abelianism,
every such transposition can be effected. And, as any substitution
is the product of transpositions, successive applications of our hypothesis
of semi-abelianism will transform the left side of each equation so that each
of its $\mu-1$ sets assumes the form it has on the right. That is, each equation
of the given formal semi-abelianism will be satisfied by the elements of any
m-group that is $\mu$-semi-abelian. The equivalence in question has therefore
been demonstrated.

That $\mu$-semi-abelianism is a different type of semi-abelianism for different
divisors $\mu-1$ of $m-1$ is readily proved by examples. By the theorem of
the next section, an m-group $G = G_0 s_0$ will be determined by the following
hypothesis: $G_0$ an ordinary cyclic group of order $2^{m-1}-1$ generated by $t$,
$s_0^{m-1} = 1$, $s_0^{-1} t s_0 = t^2$. Since $G_0$ is abelian, the first result of the next paragraph
shows G to be m-semi-abelian. Now a similar argument shows an m-group G
to be $\mu$-semi-abelian, $\mu-1$ a divisor of $m-1$, when and only when the $(\mu-1)$-ads
of G are commutative with the $(m-1)$-ads of G. Since $s_0^{m-1}$ is the first
ordinary positive power of $s_0$ commutative with $t$, it follows that G is not
$\mu$-semi-abelian for any divisor $\mu-1$ of $m-1$ other than $m-1$. Now let $\mu_1-1$,
$\mu_2-1$ be any two distinct divisors of $m-1$ with, say, $\mu_1 > \mu_2$. By the preceding
method construct a $\mu_1$-group G’ which is $\mu_1$-semi-abelian, but not $\mu_3$-semi-abelian
for any divisor $\mu_3-1$ of $\mu_1-1$ other than $\mu_1-1$. The extension of G’
to an m-group G’’ then has the same property. It then follows that the
m-group G’’ while $\mu_1$-semi-abelian is not $\mu_2$-semi-abelian, since otherwise it
would be $\mu_3$-semi-abelian with $\mu_3-1 = \text{H.C.F.}(\mu_1-1, \mu_2-1)$, and thus a divisor
of $\mu_1-1$ other than $\mu_1-1$. The m-group G’’ thus shows $\mu_1$-semi-abelianism
to be not equivalent to $\mu_2$-semi-abelianism whenever $\mu_1 \ne \mu_2$. Coupled with
our previous theorem it yields the following result. There are as many distinct
types of semi-abelianism for m-adic groups as there are distinct divisors of $m-1$.
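
The arithmetic behind the example can be checked directly. Transforming $t$ by $s_0^j$ gives $t^{2^j}$, so $s_0^j$ is commutative with $t$ exactly when $2^j \equiv 1$ modulo the order $2^{m-1}-1$ of $t$. The sketch below, with the hypothetical helper names `least_commuting_power` and `semi_abelian_types`, computes the least such $j$ and lists the values of $\mu$ with $\mu-1$ a divisor of $m-1$. It assumes $m \ge 3$, and it counts $\mu = 2$, that is, ordinary commutativity, among the types.

```python
def least_commuting_power(m):
    """Least j >= 1 with 2**j congruent to 1 modulo 2**(m-1) - 1 (assumes m >= 3)."""
    order = 2 ** (m - 1) - 1
    j = 1
    while pow(2, j, order) != 1:
        j += 1
    return j

def semi_abelian_types(m):
    """All mu for which mu - 1 is a divisor of m - 1."""
    return [d + 1 for d in range(1, m) if (m - 1) % d == 0]

m = 7
print("least commuting power of s0:", least_commuting_power(m))
print("values of mu for m = 7:", semi_abelian_types(m))
```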

In what follows we restrict our attention to ordinary, that is, m-semi-abelianism,
a property implied by any type of semi-abelianism. Since the associated
ordinary group $G_0$ of an m-group G consists of the products of $m-1$
arbitrary elements of G, the condition that $G_0$ is abelian is a condition of
semi-abelianism on G of formal type

$$s_1 s_2 \cdots s_{m-1} s_m s_{m+1} \cdots s_{2m-2} = s_m s_{m+1} \cdots s_{2m-2} s_1 s_2 \cdots s_{m-1}.$$

As each letter suffers a displacement $m-1$, by our general result this type
of semi-abelianism is equivalent to m-semi-abelianism. Hence, every semi-abelian
m-group has an abelian associated group, and conversely. If an element $s$
of a semi-abelian group G is invariant under G, it is also invariant under $G_0$,
and hence $G = G_0 s$ is abelian. That is, if a semi-abelian m-group is non-abelian,
it has no invariant element. If $s_1$ and $s_2$ are any two elements of semi-abelian G,
$t$ any element of $G_0$, then, since $s_1 = t' s_2$, with $t'$ in $G_0$, and since $t$ and $t'$ are
commutative, we have $s_1^{-1} t s_1 = s_2^{-1} t s_2$. Hence, all the elements of a semi-abelian
m-group G transform an arbitrary given element of the associated group $G_0$ into
the same element. Now let H be any subgroup of semi-abelian G. Its associated
subgroup $H_0$ is then invariant under any element $s_0$ of H. But every element $s$
of G transforms the elements of $H_0$ as does $s_0$. Hence $H_0$ is invariant under G.
That is, every subgroup of a semi-abelian group is semi-invariant(85).
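
These statements can be tested on the 3-group formed by the three transpositions on three letters under the product of three substitutions. This example and the helper names below are assumptions chosen for illustration. The sketch checks that the associated group is abelian, that the 3-group is semi-abelian without being abelian, and that it has no invariant element.

```python
from itertools import permutations, product

def compose(p, q):
    """Product pq in the transformation convention: perform p, then q."""
    return tuple(q[p[x]] for x in range(len(p)))

def is_odd(p):
    inversions = sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))
    return inversions % 2 == 1

G = [p for p in permutations(range(3)) if is_odd(p)]   # the three transpositions
G0 = {compose(a, b) for a in G for b in G}             # products of m-1 = 2 elements

def c(a, b, d):
    """Triadic operation of G: the ordinary product of three substitutions."""
    return compose(compose(a, b), d)

g0_abelian = all(compose(x, y) == compose(y, x) for x in G0 for y in G0)
semi_abelian = all(c(a, b, d) == c(d, b, a) for a, b, d in product(G, repeat=3))
abelian = all(c(a, b, d) == c(b, a, d) for a, b, d in product(G, repeat=3))
# s is invariant under G when u^{-1} s u = s, that is su = us, for every u in G.
invariant = [s for s in G if all(compose(s, u) == compose(u, s) for u in G)]

print(g0_abelian, semi_abelian, abelian, invariant)
```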

8. On the construction of polyadic groups. We proceed to prove the following
general theorem on the construction of abstract polyadic groups referred
to in connection with the converse of the coset theorem. Given any
abstract 2-group $G_0$ to serve as associated group, an abstract element $s_0$ subject
to the condition $s_0^{m-1} = t_0$, $t_0$ in $G_0$, and any automorphism $T$ of $G_0$, which carries
$t_0$ into itself, and whose $(m-1)$-st power is the automorphism of $G_0$ under $t_0$,
to serve as the automorphism of $G_0$ under $s_0$, then there is one and only one corresponding
abstract m-group G; conversely every m-group can be thus determined(86).


(85) See Dörnte’s §7 for quite a different set of properties of semi-abelian groups. Dörnte’s
result that a triadic group consisting of first order elements only must be semi-abelian is
equivalent for finite groups to a result of Miller’s as a consequence of the above equivalence of
the semi-abelianism of G, and abelianism of $G_0$. By introducing the polyadic groups $G_i$ of our
§34 to take the place of $G_0$ in the discussion of the last paragraph, the results of that paragraph
can be specifically generalized to $\mu$-semi-abelianism.

(86) After this theorem was obtained by the writer, a closely related result was published
by Turing as an illustration of a more general theorem in the theory of group extensions. (Not 




For the second part of this theorem note that given an m-group G, and
any $s_0$ in G, $G_0$, $t_0$, and $T$ are determined, and obviously satisfy the conditions
of the theorem. It follows from the first part of the succeeding proof that G
is determinable as stated.

We turn then to the first part of the theorem. For purposes of analysis,
consider the coset representation of a hypothetical G satisfying the given
conditions. We would then have $G = G_0 s_0$. If we write the elements of $G_0$ as $t_i$,
we may correspondingly symbolize the elements of G by $s_i$, with $s_i = t_i s_0$. Of
course $s_0$ must then be identified with that $s_i$ for which $t_i$ is the identity of $G_0$,
while $t_0$ will appear as some $t_i$. We must then have, for the operation of G,

$$c(s_{i_1} s_{i_2} \cdots s_{i_m}) = t_{i_1} s_0 t_{i_2} s_0 \cdots t_{i_m} s_0 = t_{i_1} \cdot T^{-1} t_{i_2} \cdot T^{-2} t_{i_3} \cdots T^{-(m-1)} t_{i_m} \cdot t_0 \cdot s_0,$$

so that $c(s_{i_1} s_{i_2} \cdots s_{i_m})$, and with it G, if it exists, is completely determined by
our hypothesis.

We next prove that the elements $s_i = t_i s_0$ actually constitute an m-group
under this operation. As to condition 1 of the definition of an m-group, given
$c(s_{i_1} s_{i_2} \cdots s_{i_m}) = s_{i_{m+1}}$, with all s’s but $s_{i_j}$ specified members of G, we
correspondingly have $t_{i_1} \cdot T^{-1} t_{i_2} \cdots T^{-(m-1)} t_{i_m} \cdot t_0 = t_{i_{m+1}}$ with all elements specified
members of $G_0$ with the exception of $t_{i_{m+1}}$, when $j = m+1$, $T^{-(j-1)} t_{i_j}$, when
$j \ne m+1$. In the first case, a unique $t_{i_{m+1}}$ in $G_0$, and, hence $s_{i_{m+1}}$ in G, are
immediately determined. In the second case, a unique $T^{-(j-1)} t_{i_j}$ in $G_0$ is determined,
hence again $t_{i_j}$ in $G_0$, and $s_{i_j}$ in G. As for condition 2, we have


the last since $T^{-(m-1)} t_0 = t_0$, and $t_0 \cdot t = T^{-(m-1)} t \cdot t_0$, by our hypothesis. The result
is thus independent of $j$, whence follows condition 2.


It remains to be shown that the m-group G thus obtained actually redetermines,
via $s_0$, the $G_0$, $t_0$, $T$ of the given hypothesis(87). From the operation c


to be confused with our polyadic concept of §5. See A. M. Turing, The extensions of a group,
Compositio Mathematica, vol. 5 (1938), pp. 357-367.) From this point of view, the abstract
containing groups of m-groups with given $G_0$ are the extensions of $G_0$ by the cyclic group of
order $m-1$. Our theorem on the determination of G could then have been based on the determination
of G* as cyclic extension of $G_0$. The theorem on cyclic extensions thus envisaged would be
not quite Turing’s (Theorem 5, loc. cit.), but equivalent thereto by the identification of our T 
with his to with 

(87) In connection with the preceding footnote it must be mentioned that this part of the 
proof was overlooked by the writer until the final check-up on the entire paper. 




as given, and again with the aid of the relation $T^{-(m-1)} t_{i_j} \cdot t_0 = t_0 \cdot t_{i_j}$, we see
that equivalent $(m-1)$-ads $\{s_{i_1}, s_{i_2}, \ldots, s_{i_{m-1}}\}$ are those for which the corresponding
elements $t_{i_1} \cdot T^{-1} t_{i_2} \cdots T^{-(m-2)} t_{i_{m-1}} \cdot t_0$ of the given 2-group $G_0$ are
the same. If then we represent the elements of the associated 2-group of
G thus by the elements of the given group $G_0$, and determine the operation
of this associated group via
$[\{s_{i_1}, s_{i_2}, \ldots, s_{i_{m-1}}\}][\{s_{i_m}, s_{i_{m+1}}, \ldots, s_{i_{2m-2}}\}] = [\{c(s_{i_1} s_{i_2} \cdots s_{i_m}), s_{i_{m+1}}, \ldots, s_{i_{2m-2}}\}]$,
bracket meaning class of $(m-1)$-ads
equivalent to the specified $(m-1)$-ad, we find this operation, again with
the help of the above relation, to be identical with the operation of the
given 2-group. That is, abstractly, the given $G_0$ is the associated ordinary
group of G. Since, for the $s_i = s_0$, $t_i$ is the identity, we immediately have
$s_0^{m-1} = [\{s_0, s_0, \ldots, s_0\}] = t_0$ in the above representation. Finally, by introducing
identity, hence inverse, and thus transform, in their original polyadic
form, it can likewise be shown that if the elements of $G_0$ are transformed by $s_0$
the resulting automorphism of $G_0$ is $T$. Therefore, the proof has been completed.

We have already used the converse of the coset theorem in giving an example
of a 3-group of order three having no invariant element. This 3-group
can now be given abstractly in accordance with the above theorem. For $G_0$,
take the cyclic group $(1, t, t^2)$. Let $s_0^2 = 1$, and let the automorphism $T$ of
$G_0$ be $(1, t, t^2) \to (1, t^2, t)$. Our hypothesis is verified, thus giving us a 3-group
$(s_0, ts_0, t^2 s_0)$ of order three. We obtain directly $(ts_0)^{-1} s_0 (ts_0) = t^2 s_0$,
$s_0^{-1} (ts_0) s_0 = t^2 s_0$, $s_0^{-1} (t^2 s_0) s_0 = ts_0$, proving that none of the three elements
of the 3-group are invariant under the 3-group.
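
The construction can be carried out mechanically. The following sketch is an illustration under stated assumptions: it represents each element $t s_0$ of G by its factor $t$ in $G_0$, encodes $G_0 = (1, t, t^2)$ additively by the exponents 0, 1, 2, and multiplies the factors of a polyad after applying $T^{-1}$ once more to each successive factor, with $t_0$ appended at the end. The names `make_operation` and `transform` are hypothetical.

```python
from itertools import product

def make_operation(mul, T_inv, t0):
    """m-ary operation on G, each element t*s0 being represented by its t in G0."""
    def c(*ts):
        acc = None
        for j, t in enumerate(ts):
            for _ in range(j):          # apply T^{-1} j times to the (j+1)-th factor
                t = T_inv(t)
            acc = t if acc is None else mul(acc, t)
        return mul(acc, t0)
    return c

# Assumed encoding: G0 = {1, t, t^2} written additively as exponents 0, 1, 2.
G0 = range(3)
mul = lambda a, b: (a + b) % 3
T_inv = lambda a: (2 * a) % 3          # T sends t to t^2 and is its own inverse
t0 = 0                                 # s0^2 = 1
c = make_operation(mul, T_inv, t0)

def transform(s, u):
    """The x in G with u*x = s*u in the containing group, i.e. x = u^{-1} s u."""
    return next(x for x in G0 if c(u, x, 0) == c(s, u, 0))

associative = all(
    c(c(a, b, d), e, f) == c(a, c(b, d, e), f) == c(a, b, c(d, e, f))
    for a, b, d, e, f in product(G0, repeat=5)
)
solvable = all(
    len({c(*(w[:j] + (x,) + w[j:])) for x in G0}) == 3
    for j in range(3) for w in product(G0, repeat=2)
)
invariant = [s for s in G0 if all(transform(s, u) == s for u in G0)]

print(associative, solvable, invariant)
```

With this encoding, `transform(1, 0)` is the transform of $ts_0$ by $s_0$, and an empty `invariant` list corresponds to the absence of invariant elements.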

This theorem may be used to determine all finite abstract polyadic groups
of given small order. In this connection we have as an immediate consequence
of the preceding theorem the following. A necessary and sufficient condition
that two m-groups G’ and G’’ be simply isomorphic is that a simple
isomorphism can be set up between their associated 2-groups $G_0'$ and $G_0''$, and
an element $s_0'$ of G’ made to correspond to an element $s_0''$ of G’’, so that $(s_0')^{m-1}$
in $G_0'$ corresponds to $(s_0'')^{m-1}$ in $G_0''$, and $s_0'$ and $s_0''$ transform $G_0'$ and $G_0''$
respectively so that corresponding elements go over into corresponding elements.
We postpone the application of these theorems even to our modest determination
of the polyadic groups of the first three orders until our detailed study
of cyclic polyadic groups of finite order gives us some basis for comparison
of polyadic groups.

However, one result of some theoretical interest emerges immediately. 
From our general determination theorem, it follows that the number of m-adic 
groups with g given symbols as elements is no greater than the number of 
2-groups on g other given symbols as elements times g times the largest num- 
ber of automorphisms a 2-group of order g can have. We may therefore con- 
clude that the number of abstract m-adic groups of given finite order g is a 
bounded function of m. 




II. FINITE POLYADIC GROUPS 
A. m-ADIC SUBSTITUTIONS AND SUBSTITUTION GROUPS 


9. The symmetric m-adic substitution group of degree n. An ordinary
substitution, finite or infinite, may be considered to be a 1-1 correspondence
between the members of a class Γ and the members of the same class. Let
now $\Gamma_1, \Gamma_2, \ldots, \Gamma_{m-1}$ be an ordered sequence of $m-1$ equivalent classes. By
an m-adic substitution on $\Gamma_1, \Gamma_2, \ldots, \Gamma_{m-1}$ we shall mean a transformation
which in 1-1 fashion carries the members of $\Gamma_1$ into those of $\Gamma_2$, of $\Gamma_2$ into
those of $\Gamma_3$, ..., of $\Gamma_{m-1}$ into those of $\Gamma_1$(88). Symbolically we shall write
$\Gamma_1 \to \Gamma_2$, $\Gamma_2 \to \Gamma_3$, ..., $\Gamma_{m-1} \to \Gamma_1$. Intrinsically, therefore, the Γ’s really enter
into an m-adic substitution as a cycle, with $\Gamma_1$ following $\Gamma_{m-1}$. If $s_1$ and $s_2$
represent two m-adic substitutions on the same sequence of Γ’s, we may as
usual refer to $s_1 s_2$, the product of $s_1$ and $s_2$, that is, the transformation equivalent
to performing $s_1$ followed by $s_2$. But in general, for $m > 2$, the product of
two m-adic substitutions will not be an m-adic substitution on the given
sequence of Γ’s, for it will transform $\Gamma_1$ into $\Gamma_3$, instead of $\Gamma_2$. On the other
hand, the product of m m-adic substitutions on the $m-1$ Γ’s will again transform
$\Gamma_1$ into $\Gamma_2$, ..., $\Gamma_{m-1}$ into $\Gamma_1$, and hence we can expect to have m-adic
groups of m-adic substitutions(89). We can likewise expect to have m-adic
groups of $\mu$-adic substitutions provided $\mu-1$ is a divisor of $m-1$. However,
by m-adic substitution group we shall understand the former, that is, a set of
m-adic substitutions, all on the same sequence of Γ’s, and forming an m-adic
group under the product of m substitutions as operation(90).

When the Γ’s are mutually exclusive, an m-adic substitution can be given
by an ordinary substitution where the one class Γ is the logical sum of the
given Γ’s. On the other hand, when the Γ’s have common elements, an m-adic
substitution cannot in general be thus considered, since one and the same element
may be transformed into different elements according to the $\Gamma_i$ of which
it is considered to be a member. We shall restrict our attention to the former
case(91). But our results will be foreshadowed not by considering the resulting


(88) Our language is that of transformation; that is, we shall say “a is carried into b”
where the language of substitution would say “a is replaced by b.”

(89) On the other hand, the product of m m-adic substitutions not all on the same sequence
of Γ’s will “usually” fail to be an m-adic substitution for any sequence of Γ’s. Hence the
straight-laced definition following.

(90) The following generalization of ordinary substitution likewise suggests itself in connection
with the schar concept. For but two equivalent classes $\Gamma_1$, $\Gamma_2$, consider transformations
which in 1-1 fashion carry the elements of $\Gamma_1$ into those of $\Gamma_2$. If A, B, C are three such
transformations, then $AB^{-1}C$ is also such a transformation. Note that here the product of two such
transformations does not, in general, even exist.

(91) For simplicity. If each member a of $\Gamma_i$ is replaced by the couple (i, a), Γ’s not mutually
exclusive become mutually exclusive, and it is then readily seen when results obtained for mutually
exclusive Γ’s hold for arbitrary Γ’s. Actually, our results were first obtained for arbitrary
Γ’s. But that they are so little affected by the overlapping or nonoverlapping of the Γ’s indicates
that we have left wholly unexplored the more interesting part of the complete theory.




m-adic substitutions special types of ordinary substitutions, but generalizations
of ordinary substitutions, reducing to the latter when $m = 2$.

We further restrict our attention to the case where the Γ’s are finite
classes, and hence consist each of the same finite number of members n. The
analogy with an ordinary substitution will be furthered by saying that the
m-adic substitution is then of degree n. Let then the members of $\Gamma_i$ be symbolized
$a_{i1}, a_{i2}, \ldots, a_{in}$, the last class $\Gamma_{m-1}$ thus having the members
$a_{(m-1)1}, a_{(m-1)2}, \ldots, a_{(m-1)n}$. Corresponding to the primitive mode of writing ordinary
substitutions we have the following form for any m-adic substitution on
$\Gamma_1, \Gamma_2, \ldots, \Gamma_{m-1}$:

$$\begin{pmatrix}
a_{1j_1} & a_{1j_2} & \cdots & a_{1j_n} \\
a_{2j_1'} & a_{2j_2'} & \cdots & a_{2j_n'} \\
a_{3j_1''} & a_{3j_2''} & \cdots & a_{3j_n''} \\
\vdots & \vdots & & \vdots \\
a_{1j_1^{(m-1)}} & a_{1j_2^{(m-1)}} & \cdots & a_{1j_n^{(m-1)}}
\end{pmatrix}$$

where the $i$th row is some permutation of $(a_{i1} a_{i2} \cdots a_{in})$ except for $i = m$,
when it is a permutation of the first row, and each letter is carried into the
one immediately below it by the substitution. If, as suggested above, we consider
our m-adic substitution an ordinary substitution on all the letters $a_{ij}$,
it can also be written in standard form as a product of cycles on different
letters. In that case, each cycle will have a multiple of $m-1$ letters, these
letters cyclically running through the $m-1$ Γ’s.

Since an m-adic substitution of degree $n$ is thus determined by $m-1$
independent permutations of $n$ elements each, we thus see that there are
$(n!)^{m-1}$ m-adic substitutions of degree $n$, the sequence of Γ's being understood
given. Observe again that if $s_1, s_2, \dots, s_m$ are m-adic substitutions
on $\Gamma_1, \Gamma_2, \dots, \Gamma_{m-1}$, their product $s_1 s_2 \cdots s_m$ is also an m-adic substitution
on $\Gamma_1, \Gamma_2, \dots, \Gamma_{m-1}$. In detail, $s_1$ will carry $a_{ij}$ into some $a_{(i+1)j'}$, $s_2$ will carry
$a_{(i+1)j'}$ into some $a_{(i+2)j''}$, and so on, and $s_m$ a resulting $a_{ij^{(m-1)}}$ into $a_{(i+1)j^{(m)}}$. Hence
$s_1 s_2 \cdots s_m$ carries $a_{ij}$ into $a_{(i+1)j^{(m)}}$ as required. It then easily follows that the
$(n!)^{m-1}$ m-adic substitutions of degree $n$ constitute an m-group under the
operation $s_1 s_2 \cdots s_m$. While the corresponding result holds good apart from
our hypothesis of finite mutually exclusive Γ's, for the present case it suffices
to reinterpret our m-adic substitutions as ordinary substitutions. Condition 2
for an m-group then follows from the associative law for the multiplication
of ordinary substitutions. As for condition 1, the case where all $s$'s
but $s_{m+1}$ in $s_1 s_2 \cdots s_m = s_{m+1}$ are given m-adic substitutions has been taken
care of. And if all but $s_i$ are given m-adic substitutions, $1 \le i \le m$, by letting $s_i$
run through the $(n!)^{m-1}$ possible m-adic substitutions, $s_1 s_2 \cdots s_m$ must do the
same, and hence equals $s_{m+1}$ for one and only one m-adic substitution $s_i$.
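The count and the two m-group conditions can be checked by brute force for small cases. The following is only a sketch under an assumed encoding that is not in the original: letter $a_{ij}$ is the pair `(i, j)` with 0-based indices, and a substitution is a tuple of $m-1$ permutations, row `i` sending `(i, j)` to `((i + 1) % (m - 1), rows[i][j])`. The function names are illustrative.

```python
from itertools import permutations, product
from math import factorial

def all_substitutions(m, n):
    """All m-adic substitutions of degree n as tuples of m-1 permutations.
    Row i sends letter (i, j) to ((i + 1) % (m - 1), rows[i][j])."""
    return list(product(permutations(range(n)), repeat=m - 1))

def multiply(subs):
    """Product s_1 s_2 ... s_m, applying s_1 first, returned in the same row form."""
    k, n = len(subs[0]), len(subs[0][0])
    rows = []
    for i in range(k):
        row = []
        for j in range(n):
            ci, cj = i, j
            for s in subs:                      # follow the letter through each factor
                ci, cj = (ci + 1) % k, s[ci][cj]
            assert ci == (i + 1) % k            # m steps advance the class by exactly one
            row.append(cj)
        rows.append(tuple(row))
    return tuple(rows)

m, n = 3, 2
S = all_substitutions(m, n)
S_set = set(S)

# Closure: every product of m substitutions is again an m-adic substitution.
closed = all(multiply(c) in S_set for c in product(S, repeat=m))

# Condition 1: with all factors but one fixed, the product runs through all of S.
solvable = all(
    {multiply(rest[:pos] + (x,) + rest[pos:]) for x in S} == S_set
    for pos in range(m)
    for rest in product(S, repeat=m - 1)
)

print(len(S), factorial(n) ** (m - 1), closed, solvable)
```

Larger values of `m` and `n` work the same way, though the number of products grows as $((n!)^{m-1})^m$.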

We shall call this m-group of order $(n!)^{m-1}$ the m-adic symmetric group of




250 E. L. POST [September 


degree $n$. It clearly becomes the ordinary symmetric group of degree $n$ when
$m=2$. As in the case of ordinary substitution groups, every m-adic substitution
group on $\Gamma_1, \Gamma_2, \dots, \Gamma_{m-1}$, or briefly of degree $n$, will be a subgroup of
the m-adic symmetric group of degree $n$. It readily follows that the necessary
and sufficient condition that a finite set of m-adic substitutions all on the same
sequence of Γ's form an m-adic substitution group is that the product of any
$m$ substitutions in the set be in the set.

Of special interest are those m-adic substitutions of degree $n$ in which the
last row is an exact repetition of the first row. There are clearly $(n!)^{m-2}$ such
substitutions. If $s$ be such a substitution, $s^{m-1}$ clearly carries each letter into
itself, and hence $s^m = s$. Conversely, if $s^m = s$, $s$ must be such a substitution.
According to a definition already given, $s$ is then of m-adic order one. The
unit class with $s$ as sole member therefore itself constitutes an m-adic substitution
group of order one. Hence the m-adic symmetric group of degree $n$
has $(n!)^{m-2}$ first order elements, and correspondingly $(n!)^{m-2}$ subgroups of
order one. For $m=2$ these become the sole identity of the group.
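The count of first order elements can be confirmed numerically. This is a minimal sketch, assuming the same hypothetical encoding as a tuple of $m-1$ permutation rows (row `i` sends `(i, j)` to `((i + 1) % (m - 1), s[i][j])`); it simply tests $s^m = s$ for every substitution.

```python
from itertools import permutations, product
from math import factorial

def m_adic_power(s, m):
    """s^m for an m-adic substitution s given as m-1 permutation rows."""
    k, n = len(s), len(s[0])
    rows = []
    for i in range(k):
        row = []
        for j in range(n):
            ci, cj = i, j
            for _ in range(m):                  # apply s to the letter m times
                ci, cj = (ci + 1) % k, s[ci][cj]
            row.append(cj)
        rows.append(tuple(row))
    return tuple(rows)

def first_order_count(m, n):
    """Number of m-adic substitutions of degree n with s^m = s."""
    subs = product(permutations(range(n)), repeat=m - 1)
    return sum(1 for s in subs if m_adic_power(s, m) == s)

counts = {(m, n): first_order_count(m, n) for m in (2, 3, 4) for n in (2, 3)}
for (m, n), c in sorted(counts.items()):
    print(m, n, c, factorial(n) ** (m - 2))
```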

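10. $2^{m-1}$-fold classification of m-adic substitutions; the m-adic alternating
groups. The classic theory of positive and negative substitutions involves the
use of the determinant

$$\Delta = \begin{vmatrix} 1 & a_1 & a_1^2 & \cdots & a_1^{n-1} \\ 1 & a_2 & a_2^2 & \cdots & a_2^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & a_n & a_n^2 & \cdots & a_n^{n-1} \end{vmatrix}$$
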

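which is left invariant under every positive substitution on the letters
$a_1, a_2, \dots, a_n$, and is transformed into its negative under every negative
substitution on those letters. We generalize this theory by the same means.
We now form the $m-1$ determinants $\Delta_1, \Delta_2, \dots, \Delta_{m-1}$, where $\Delta_i$ is the
determinant $\Delta$ for the letters $a_{i1}, a_{i2}, \dots, a_{in}$ of $\Gamma_i$, and transform them
according to a given m-adic substitution

$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21}' & a_{22}' & \cdots & a_{2n}' \\ \vdots & \vdots & & \vdots \\ a_{11}'' & a_{12}'' & \cdots & a_{1n}'' \end{pmatrix}.$$
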
If in the $i$th row each letter $a_{ij}$ is rewritten $a_{(i+1)j}$(42), then the new $i$th row
together with the old $(i+1)$-st row defines an ordinary substitution on the
letters of the $(i+1)$-st row. The transform of $\Delta_i$ under the m-adic substitution
is clearly the transform of $\Delta_{i+1}$ under this ordinary substitution, and
hence is $\Delta_{i+1}$, or its negative, according as this ordinary substitution is positive
or negative. We therefore have under the m-adic substitution

$$\Delta_1 \to \delta_1\Delta_2,\quad \Delta_2 \to \delta_2\Delta_3,\quad \dots,\quad \Delta_{m-1} \to \delta_{m-1}\Delta_1;\qquad \delta_1, \delta_2, \dots, \delta_{m-1} = \pm 1.$$

(42) $a_{(i+1)j}$ is $a_{1j}$ when $i=m-1$. Likewise, below, $\Delta_{i+1}$ is $\Delta_1$ when $i=m-1$.



With each m-adic substitution there is thus associated a sequence of $m-1$
numbers $\delta_1, \delta_2, \dots, \delta_{m-1}$, whose values are $+1$ or $-1$. Clearly, when $n>1$,
an m-adic substitution of degree $n$ can be written down for every possible
assignment of values to the δ's. The m-adic substitutions of degree $n$, $n>1$,
thus fall into $2^{m-1}$ mutually exclusive classes corresponding to the $2^{m-1}$ possible
δ-sequences $[\delta_1, \delta_2, \dots, \delta_{m-1}]$.

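Given $m$ m-adic substitutions $s_i$, with the corresponding δ-sequences
$[\delta_{i1}, \delta_{i2}, \dots, \delta_{i(m-1)}]$, the m-adic substitution $s_1 s_2 \cdots s_m$ has a δ-sequence
$[\delta_1, \delta_2, \dots, \delta_{m-1}]$ which depends only on the δ-sequences of the $s_i$'s. In
fact, by following through the effect of the succession of substitutions
$s_1, s_2, \dots, s_m$ on the determinants $\Delta_1, \Delta_2, \dots, \Delta_{m-1}$, we obtain the following
equations for determining $[\delta_1, \delta_2, \dots, \delta_{m-1}]$:

$$\begin{aligned} \delta_1 &= \delta_{11}\delta_{22}\cdots\delta_{(m-1)(m-1)}\delta_{m1},\\ \delta_2 &= \delta_{12}\delta_{23}\cdots\delta_{(m-1)1}\delta_{m2},\\ &\;\;\vdots\\ \delta_{m-1} &= \delta_{1(m-1)}\delta_{21}\cdots\delta_{(m-1)(m-2)}\delta_{m(m-1)}. \end{aligned}$$
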

Now let $K$ be the class of the $2^{m-1}$ possible δ-sequences. If $\sigma_1, \sigma_2, \dots, \sigma_m$ are
any $m$ such δ-sequences, and $\sigma$ is the δ-sequence obtained from $\sigma_1, \sigma_2, \dots, \sigma_m$
in accordance with the above equations, an m-adic operation $k(\sigma_1\sigma_2\cdots\sigma_m)$
is determined such that $\sigma = k(\sigma_1\sigma_2\cdots\sigma_m)$. It is then readily shown that $K$
constitutes an m-group under $k$. In fact, condition 1 for an m-group is immediately
verified by referring to the above equations. And condition 2 follows
from the associative law for m-adic substitutions, and the fact that if
$s_1, s_2, \dots, s_m$ are substitutions corresponding to $\sigma_1, \sigma_2, \dots, \sigma_m$ respectively,
$s_1 s_2 \cdots s_m$ corresponds to $k(\sigma_1\sigma_2\cdots\sigma_m)$. We shall call this m-group of order
$2^{m-1}$ the complete m-adic δ-group(43).
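The correspondence between products of substitutions and the operation $k$ on δ-sequences can be checked directly. The sketch below makes two assumptions that are not in the original: substitutions are encoded as tuples of $m-1$ permutation rows over the fixed ordering `0..n-1`, and with that ordering $\delta_i$ is the sign of row $i$. The 0-based index formula in `k` is a transcription of the equations above.

```python
from collections import Counter
from itertools import permutations, product

def sign(p):
    """Sign of a permutation given as a tuple: +1 if even, -1 if odd."""
    inv = sum(1 for a in range(len(p)) for b in range(a + 1, len(p)) if p[a] > p[b])
    return -1 if inv % 2 else 1

def delta(s):
    """delta-sequence of s relative to the fixed ordering of each class."""
    return tuple(sign(row) for row in s)

def k(sigmas):
    """m-adic operation on delta-sequences: component j is the product over
    the m factors of sigma_i[(j + i) mod (m - 1)], with 0-based i and j."""
    r = len(sigmas[0])
    out = []
    for j in range(r):
        value = 1
        for i, sigma in enumerate(sigmas):
            value *= sigma[(j + i) % r]
        out.append(value)
    return tuple(out)

def multiply(subs):
    """Product s_1 ... s_m (s_1 applied first) in row form."""
    r, n = len(subs[0]), len(subs[0][0])
    rows = []
    for i in range(r):
        row = []
        for j in range(n):
            ci, cj = i, j
            for s in subs:
                ci, cj = (ci + 1) % r, s[ci][cj]
            row.append(cj)
        rows.append(tuple(row))
    return tuple(rows)

m, n = 3, 3
S = list(product(permutations(range(n)), repeat=m - 1))
d = {s: delta(s) for s in S}

# The delta-sequence of a product depends only on the delta-sequences of the factors.
homomorphic = all(delta(multiply(c)) == k([d[s] for s in c]) for c in product(S, repeat=m))
class_sizes = Counter(d.values())
print(homomorphic, dict(class_sizes))
```

For these values the check also shows the $2^{m-1}$ classes to be of equal size, in line with the result proved later in this section.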

(43) Actually, then, we have established a homomorphism between the symmetric m-adic
substitution group of degree $n$, $n>1$, and this complete m-adic δ-group. The rest of this section
could then largely have been given as a consequence of our general results on homomorphisms
between m-adic groups, as could indeed the very fact that $K$ is an m-group under $k$. In the
generalization of this section occurring in the last section of our paper full use will be made of
the concept of homomorphism.

Consider now any m-adic substitution group of degree $n$ and form the
class $K'$ of δ-sequences corresponding to its members. Since the product of
any $m$ substitutions of the group is in the group, the $k$ product of any $m$
δ-sequences in $K'$ will be in $K'$. As $K'$ is a subclass of the class of members
of the complete m-adic δ-group, and the latter is finite, this suffices to prove
the following. The δ-sequences corresponding to the members of any m-adic substitution
group of degree $n$ form the complete m-adic δ-group, or a subgroup
thereof.

By means of the above equations for the $k$ operation we readily prove,
as for ordinary substitutions, that every m-adic substitution group of degree $n$
has the same number of substitutions for each δ-sequence in the corresponding
"δ-subgroup"(44). In fact, let $s_m$ and $s_{m+1}$ be any two substitutions in the group
corresponding to any two given δ-sequences $\sigma_m$ and $\sigma_{m+1}$ of the corresponding
δ-subgroup, and choose $s_1, s_2, \dots, s_{m-1}$ so that $s_1 s_2 \cdots s_{m-1} s_m = s_{m+1}$. If now we
let $s_m$ run through all the substitutions in the group corresponding to $\sigma_m$, $s_{m+1}$
assumes an equal number of values in the group all corresponding to $\sigma_{m+1}$.
Hence there are at least as many substitutions in the group corresponding to
one δ-sequence as to another, and consequently, by reciprocal reasoning, the
same number. Since the order of the complete m-adic δ-group is $2^{m-1}$, that
of a subgroup thereof must be of the form $2^{\mu}$(45). From the above result it
follows that the order of an m-adic substitution group is a multiple of the
order of its δ-subgroup. We therefore have as a corollary of the above result:
every m-adic substitution group of odd order has a δ-subgroup of order one, that
is, all of its substitutions correspond to one and the same δ-sequence.

Applied to the symmetric group itself, the above result shows that the
$2^{m-1}$ mutually exclusive classes into which the m-adic symmetric group of degree
$n$ is divided all have the same number of members. Now given any subgroup
of the complete m-adic δ-group, form the class $C'$ of all the m-adic
substitutions of degree $n$ corresponding to each δ-sequence in the given δ-subgroup.
The product of any $m$ substitutions in $C'$ will therefore be in $C'$. Hence
the members of $C'$ form a subgroup of the symmetric group. By analogy with
ordinary groups we shall call it an m-adic alternating group. Consequently,
there are as many m-adic alternating groups of degree $n$, $n>1$, as there are subgroups
of the complete m-adic δ-group, each alternating group consisting of all
the substitutions of the symmetric group with δ-sequences in the corresponding
δ-subgroup. We may now further state that there is a one-many correspondence
between the m-adic δ-subgroups and m-adic substitution groups of degree
$n$, $n>1$, that is, between the class consisting of the complete m-adic
δ-group and its subgroups, and the m-adic symmetric group of degree $n$ and
its subgroups; and this correspondence is preserved under the relation "group
or subgroup of."

(44) We shall use the phrase δ-subgroup to cover the complete δ-group as well.

(45) By Lagrange's theorem for polyadic groups, proved in §4.

For $m=2$ the complete δ-group is the cyclic group of order 2, and its sole
subgroup, the identity, corresponds to the sole ordinary alternating group of
degree $n$(46). For $m=3$ the complete δ-group is of order 4. By direct calculation
we find it to possess exactly four subgroups, that is, with classes of elements
$([+1, +1])$, $([-1, -1])$, $([+1, +1], [-1, -1])$, $([+1, -1], [-1, +1])$.
Hence, there are exactly four triadic alternating groups of degree $n$, $n>1$.

(46) van der Waerden has already noted the homomorphism between any substitution
group having at least one odd substitution and this cyclic group of order two.
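The direct calculation for $m=3$ can be reproduced by brute force. This sketch assumes, as stated earlier for finite sets of substitutions, that a finite class closed under the m-adic operation is a subgroup; the function names and the tuple encoding of δ-sequences are illustrative only.

```python
from itertools import combinations, product

def k(sigmas):
    """m-adic operation on delta-sequences (0-based form of the equations of this section)."""
    r = len(sigmas[0])
    out = []
    for j in range(r):
        value = 1
        for i, sigma in enumerate(sigmas):
            value *= sigma[(j + i) % r]
        out.append(value)
    return tuple(out)

def closed_subsets(m):
    """All nonempty classes of delta-sequences closed under k."""
    elements = list(product((1, -1), repeat=m - 1))
    found = []
    for size in range(1, len(elements) + 1):
        for subset in combinations(elements, size):
            members = set(subset)
            if all(k(c) in members for c in product(subset, repeat=m)):
                found.append(members)
    return found

triadic = closed_subsets(3)
proper = [h for h in triadic if len(h) < 4]   # leave out the complete delta-group
for h in proper:
    print(sorted(h))
```

The four classes printed are those listed in the text; the only other closed class is the complete δ-group itself.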

Thanks to B. P. Gill, we are able to determine the m-adic alternating
groups of degree $n$ for arbitrary $m$. For this purpose it is essential to obtain
a suitable representation of the associated ordinary group of the complete
m-adic δ-group. The ideas leading up to this are of more general application,
and hence at least part of the following digression.

11. Associated and containing ordinary groups; commutative m-adic substitutions.
The substitutions of an m-adic substitution group $G$ of degree $n$,
considered as ordinary substitutions on $(m-1)n$ letters, generate an ordinary
substitution group which satisfies our definition of a containing group of $G$.
With $G$ thus an m-adic group of m-adic substitutions, this containing group
will be of index $m-1$, and hence simply isomorphic with the abstract containing
group $G^*$ of $G$. We shall therefore use it throughout to represent $G^*$, and
for simplicity symbolize it $G^*$. We may likewise refer to the associated group
of $G$ with respect to this containing group as $G_0$.

In the terminology of §6, the $i$th coset of $G^*$ consists of the products
of $i$ elements of $G$. To avoid duplication, it will be convenient henceforth
to assume that $1 \le i \le m-1$. Since each substitution in $G$ transforms
$\Gamma_1, \Gamma_2, \dots, \Gamma_{m-1}$ into $\Gamma_2, \Gamma_3, \dots, \Gamma_1$ respectively, it follows that the $i$th coset of $G^*$ consists of
transformations which in 1-1 fashion carry the members of each $\Gamma_j$ into those
of $\Gamma_{j+i}$, $j+i$ reduced modulo $m-1$ if need be. We may therefore call these
substitutions of $G^*$ the $i$-ads of $G$. In particular, $G_0$, which consists of the
$(m-1)$-ads of $G$ in $G^*$, consists of transformations which transform each $\Gamma_j$
into itself. Each $(m-1)$-ad of $G$ thus appears in $G^*$ as the product of $m-1$
ordinary substitutions, each of these ordinary substitutions being on the letters
of a single Γ. We have incidentally verified that $G^*$ is of index $m-1$.

Considered as ordinary substitution groups on $(m-1)n$ letters we see that
for $m>2$, $G^*$ is imprimitive with systems of imprimitivity $\Gamma_1, \Gamma_2, \dots, \Gamma_{m-1}$,
while $G_0$ is intransitive with the letters in each Γ carried into letters of the
same Γ only, by every substitution of $G_0$. If then for each Γ we separate
from each substitution in $G_0$ the substitution involving only the letters of
that Γ, there results an ordinary substitution group on the letters of that Γ.
We shall symbolize these $m-1$ groups on the letters of $\Gamma_1, \Gamma_2, \dots, \Gamma_{m-1}$ by
$G_0', G_0'', \dots, G_0^{(m-1)}$ respectively, and call them the associated constituent
groups of the m-adic substitution group $G$. It is then significant that the associated
constituent groups of an m-adic substitution group are conjugate ordinary
groups. In fact, recall that $G_0$ is an invariant subgroup of $G^*$, and hence
is invariant under every m-adic substitution $s$ in $G$. Now $s$ carries the letters
of each $\Gamma_i$ into those of $\Gamma_{i+1}$. Hence, when the substitutions of $G_0$ are transformed
by $s$, the components of these substitutions on the letters of $\Gamma_i$ become
the components of the same class of substitutions on the letters of $\Gamma_{i+1}$.
We thus have specifically

$$s^{-1}G_0's = G_0'',\quad s^{-1}G_0''s = G_0''',\quad \dots,\quad s^{-1}G_0^{(m-1)}s = G_0'$$

for every $s$ in $G$.

If $s_1$ and $s_2$ are m-adic substitutions on the same sequence of Γ's, we may
consider them as elements of the corresponding m-adic symmetric group. The
transform of $s_2$ under $s_1$ is then $s_1^{-1}s_2s_1$ in the notation of the containing group
of the symmetric group, and hence may be obtained by the ordinary rule for
transforming substitutions. Restated for our primitive mode of representing
m-adic substitutions, this rule becomes the following. Replace each letter in $s_2$
by the letter immediately under it in $s_1$ and rewrite in standard form. Thus,
to illustrate, let

$$s_2 = \begin{pmatrix} a_{11} & a_{12} & a_{13}\\ a_{22} & a_{21} & a_{23}\\ a_{11} & a_{13} & a_{12} \end{pmatrix},\qquad s_1 = \begin{pmatrix} a_{11} & a_{12} & a_{13}\\ a_{22} & a_{23} & a_{21}\\ a_{13} & a_{11} & a_{12} \end{pmatrix};$$

$$s_1^{-1}s_2s_1 = \begin{pmatrix} a_{22} & a_{23} & a_{21}\\ a_{13} & a_{12} & a_{11}\\ a_{22} & a_{21} & a_{23} \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & a_{13}\\ a_{23} & a_{21} & a_{22}\\ a_{12} & a_{11} & a_{13} \end{pmatrix}.$$

Actually, the result before it is rewritten defines the transform equally well;
for, as stated before, it is really the cycle, rather than the sequence, of Γ's
that is significant.
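The replacement rule can be tried on the triadic illustration. The sketch below is an assumption-laden check, not part of the original: each array is stored as a dictionary sending a letter to the one below it, the strings `"a11"` and so on stand for the letters, and `expected` is the rewritten standard form with $a_{11} a_{12} a_{13}$ as first row, its middle row being computed here from the rule.

```python
def from_rows(rows):
    """Letter -> letter map of an array in which each letter goes to the one below it."""
    return {a: b for top, bottom in zip(rows, rows[1:]) for a, b in zip(top, bottom)}

s2 = from_rows([["a11", "a12", "a13"], ["a22", "a21", "a23"], ["a11", "a13", "a12"]])
s1 = from_rows([["a11", "a12", "a13"], ["a22", "a23", "a21"], ["a13", "a11", "a12"]])

def transform_by_rule(s2, s1):
    """Replace each letter of s2 by the letter immediately under it in s1."""
    return {s1[a]: s1[b] for a, b in s2.items()}

def conjugate(s2, s1):
    """s1^{-1} s2 s1 with substitutions applied left to right."""
    s1_inv = {b: a for a, b in s1.items()}
    return {x: s1[s2[s1_inv[x]]] for x in s1}

expected = from_rows([["a11", "a12", "a13"], ["a23", "a21", "a22"], ["a12", "a11", "a13"]])

print(transform_by_rule(s2, s1) == conjugate(s2, s1) == expected)
```

Both routes give the same map, which illustrates why the unrewritten array defines the transform equally well.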

If $s_2$ is invariant under $s_1$, then $s_1$ and $s_2$ are commutative; and conversely.
The problem of determining all m-adic substitutions $s_2$ commutative with a
given m-adic substitution $s_1$ of degree $n$, and on the same Γ's, is best treated
by writing the substitutions in ordinary cycle form. We recall that the number
of letters in each cycle is then a multiple of $m-1$. If $s_1$ consists of a
single cycle, the ordinary substitutions on the $(m-1)n$ letters of $s_1$ which
are commutative with $s_1$ are the $(m-1)n$ ordinary powers of $s_1$. Of these
exactly $n$, i.e., those of the form $s_1^{\mu(m-1)+1}$, are m-adic substitutions on
$\Gamma_1, \Gamma_2, \dots, \Gamma_{m-1}$. We shall later call these the m-adic powers of $s_1$, i.e.,
the elements of the m-adic group generated by $s_1$. Hence, the only m-adic
substitutions on the Γ's of $s_1$, commutative with the single cycle m-adic
substitution $s_1$ of degree $n$, are the $n$ m-adic powers of $s_1$. We now have no
difficulty in paraphrasing the corresponding argument for ordinary substitutions,
and obtain the following results. If $s_1$ consists of $\lambda$ cycles with numbers
of letters $(m-1)n_1, (m-1)n_2, \dots, (m-1)n_\lambda$, no two of which are equal, the
m-adic substitutions $s_2$ on the Γ's of $s_1$, commutative with $s_1$, are the
$n_1 n_2 \cdots n_\lambda$ products of the m-adic powers of the several cycles. And, if $s_1$
consists of $k$ equal cycles of $(m-1)\nu$ letters each, the m-adic substitutions $s_2$
on the Γ's of $s_1$ commutative with $s_1$ are $\nu^k k!$ in number, there being $\nu^k$ such
m-adic substitutions for each of the $k!$ possible permutations of the $k$ cycles.
Clearly in any case, the m-adic substitutions $s_2$ commutative with an m-adic
substitution $s_1$, and on the Γ's of $s_1$, constitute an m-adic substitution group.
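These counts can be tested exhaustively for a small case. The sketch uses a hypothetical encoding (a tuple of $m-1$ permutation rows, row `i` sending `(i, j)` to `((i + 1) % (m - 1), s[i][j])`). The function `predicted` combines the two stated cases into one product over cycle lengths, a factor $\nu^k k!$ for each group of $k$ cycles of $(m-1)\nu$ letters; that combination is an assumption of the sketch, the text stating only the two extreme cases.

```python
from collections import Counter
from itertools import permutations, product
from math import factorial

def commutes(s, t):
    """True when s and t commute as ordinary substitutions on the letters (i, j)."""
    r, n = len(s), len(s[0])
    return all(
        t[(i + 1) % r][s[i][j]] == s[(i + 1) % r][t[i][j]]
        for i in range(r) for j in range(n)
    )

def cycle_lengths(s):
    """Cycle lengths of s regarded as an ordinary substitution."""
    r, n = len(s), len(s[0])
    seen, lengths = set(), []
    for start in product(range(r), range(n)):
        if start in seen:
            continue
        length, (ci, cj) = 0, start
        while (ci, cj) not in seen:
            seen.add((ci, cj))
            ci, cj = (ci + 1) % r, s[ci][cj]
            length += 1
        lengths.append(length)
    return lengths

def predicted(s, m):
    """Product of nu^k * k! over groups of k cycles of (m-1)*nu letters."""
    total = 1
    for length, mult in Counter(cycle_lengths(s)).items():
        total *= (length // (m - 1)) ** mult * factorial(mult)
    return total

m, n = 3, 3
S = list(product(permutations(range(n)), repeat=m - 1))
agree = all(sum(commutes(s, t) for t in S) == predicted(s, m) for s in S)
print(agree)
```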




12. Further study of the complete m-adic δ-group and m-adic alternating
groups. The ideas of the preceding section enable us to clear up a certain
difficulty in our presentation of the $2^{m-1}$-fold classification of m-adic substitutions
and in its consequences. Observe that whereas the $\Gamma_i$'s are mere classes,
the determinants $\Delta_i$ assume the letters in each $\Gamma_i$ arranged in a sequence.
The δ-sequence associated with a given m-adic substitution $s$ will therefore
in general depend not only on $s$ but also on the original ordering of the letters
in the $\Gamma_i$'s. However, we shall see that the same $2^{m-1}$ classes are obtained
no matter what ordering is assumed, only their description by δ-sequences
being thus affected.

Actually, this ordering is equivalent to a first order m-adic substitution $s_0$
which carries each $a_{ij}$ into $a_{(i+1)j}$, $i+1$ being replaced by 1 when $i=m-1$.
Let us then write the m-adic symmetric group in coset form with arbitrary
element $s = ts_0$. $t$ is then in the form $t't''\cdots t^{(m-1)}$ where $t^{(i)}$ is an ordinary
substitution on the letters of $\Gamma_i$. If now we associate with $t$ the ε-sequence
$(\epsilon_1, \epsilon_2, \dots, \epsilon_{m-1})$, where $\epsilon_i$ is $+1$ or $-1$ according as $t^{(i)}$ is a positive or
negative substitution, we see from the effect on the determinants $\Delta_i$ that the
ε-sequence of $t$ is identical with the δ-sequence of $s$. Let then $s_1 = t_1s_0$ and
$s_2 = t_2s_0$ have the same δ-sequence, and hence $t_1$ and $t_2$ the same ε-sequence.
Then $s_1s_2^{-1} = t_1t_2^{-1}$ will have an ε-sequence $(+1, +1, \dots, +1)$, i.e., will be
the product of positive substitutions only. Conversely, if $s_1s_2^{-1}$ is the product
of positive substitutions only, the corresponding $t_1$ and $t_2$ must have the same
ε-sequence, and $s_1$ and $s_2$ the same δ-sequence. Hence, $s_1$ and $s_2$ belong to the
same one of the $2^{m-1}$ classes of m-adic substitutions when and only when $s_1s_2^{-1}$
is the product of positive substitutions on the letters of the several Γ's. As
this criterion is independent of $s_0$, the intrinsic character of our classification
has been demonstrated.

The ε-sequences may be used to obtain a concrete representation of the
associated ordinary group of the complete m-adic δ-group. More generally,
consider the containing group of the m-adic symmetric group of degree $n$,
$n>1$. Since each $i$-ad $R$ thereof is the product of $i$ m-adic substitutions, $R$ will
transform the Δ's according to some scheme

$$\Delta_1 \to \eta_1\Delta_{1+i},\quad \Delta_2 \to \eta_2\Delta_{2+i},\quad \dots,\quad \Delta_{m-1} \to \eta_{m-1}\Delta_{m-1+i}.$$

With $R$ we may thus associate the η-sequence (with subscript)
$\{\eta_1, \eta_2, \dots, \eta_{m-1}\}_i$. If thus $\{\eta_{11}, \eta_{12}, \dots, \eta_{1(m-1)}\}_{i_1}$ corresponds to $R_1$,
$\{\eta_{21}, \eta_{22}, \dots, \eta_{2(m-1)}\}_{i_2}$ to $R_2$, $R_1R_2$ will correspond to
$\{\eta_{11}\eta_{2(1+i_1)}, \eta_{12}\eta_{2(2+i_1)}, \dots, \eta_{1(m-1)}\eta_{2(m-1+i_1)}\}_{i_1+i_2}$,
subscripts being reduced modulo $m-1$ if need be. It follows
that the containing group of the m-adic symmetric group is homomorphic to
the resulting complete η-group (with subscript). Now with $i=1$, the η-sequence
is nothing more than the δ-sequence of the corresponding m-adic substitution.
From the way in which our operations were obtained it follows that
the complete η-group may be considered a containing group, of index $m-1$,
indeed, of the complete m-adic δ-group. The associated group of the complete
m-adic δ-group will then be composed of the η-sequences whose subscript is
$m-1$. But going back to the Δ's we see that these η-sequences are then actually
the ε-sequences of the corresponding $(m-1)$-ads of m-adic substitutions.
Under this representation, therefore, the operation of the associated
group of the complete m-adic δ-group, i.e., of the complete ε-group as we shall
call it, becomes

$$(\epsilon_{11}, \epsilon_{12}, \dots, \epsilon_{1(m-1)})(\epsilon_{21}, \epsilon_{22}, \dots, \epsilon_{2(m-1)}) = (\epsilon_{11}\epsilon_{21}, \epsilon_{12}\epsilon_{22}, \dots, \epsilon_{1(m-1)}\epsilon_{2(m-1)}).$$

We therefore see that the complete ε-group is an ordinary abelian group of
order $2^{m-1}$. Since each $\epsilon$ is $\pm 1$, its elements other than the identity are all of
order two, so that it is indeed of type $(1, 1, \dots, 1)$.

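The complete m-adic δ-group is therefore semi-abelian. As it is readily
seen to be non-abelian whenever $m>2$, it follows that it then has no invariant
element. More specifically, the transform of $[\delta_1, \delta_2, \delta_3, \dots, \delta_{m-1}]$ by
$[\delta_1', \delta_2', \delta_3', \dots, \delta_{m-1}']$ is easily found, via the complete η-group, to be

$$[\delta_{m-1}\delta_{m-1}'\delta_1',\ \delta_1\delta_1'\delta_2',\ \delta_2\delta_2'\delta_3',\ \dots,\ \delta_{m-2}\delta_{m-2}'\delta_{m-1}'].$$

The condition for invariance is then easily rewritten $\delta_1\delta_1' = \delta_2\delta_2' = \delta_3\delta_3' = \cdots
= \delta_{m-1}\delta_{m-1}'$. It follows that there are exactly two δ-sequences leaving any
given δ-sequence invariant, namely, $[\delta_1, \delta_2, \dots, \delta_{m-1}]$ and $[-\delta_1, -\delta_2, \dots, -\delta_{m-1}]$.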
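The transform can be worked out step by step. The following derivation is a sketch under two assumptions: that the transform of $\sigma$ by $\sigma'$ means $\sigma'^{-1}\sigma\sigma'$ in the complete η-group, as for substitutions in §11, and that η-sequences multiply by the rule that the $j$th component of a product is $\eta_{1j}\eta_{2(j+i_1)}$ with subscript $i_1+i_2$, all indices reduced modulo $m-1$.

```latex
\begin{align*}
\sigma' &= \{\delta'_1, \delta'_2, \dots, \delta'_{m-1}\}_{1}, &
\sigma'^{-1} &= \{\delta'_{m-1}, \delta'_1, \dots, \delta'_{m-2}\}_{m-2},\\
\sigma'^{-1}\sigma &= \{\delta'_{m-1}\delta_{m-1},\ \delta'_1\delta_1,\ \dots,\ \delta'_{m-2}\delta_{m-2}\}_{m-1},\\
\sigma'^{-1}\sigma\sigma' &= \{\delta_{m-1}\delta'_{m-1}\delta'_1,\ \delta_1\delta'_1\delta'_2,\ \dots,\ \delta_{m-2}\delta'_{m-2}\delta'_{m-1}\}_{1}.
\end{align*}
```

Equating the $j$th component with $\delta_j$ and multiplying both sides by $\delta'_j$ gives $\delta_j\delta'_j = \delta_{j-1}\delta'_{j-1}$ for every $j$, which is the stated condition for invariance.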

The present and succeeding paragraph presuppose a partial reading of the
later §21 and §22. We have observed that except for the identity the elements
of the complete ε-group are all of order two. While it follows therefrom that
the elements of the complete m-adic δ-group are of no other m-adic orders
than one or two, we find directly that exactly half of them are of order
one, half of order two. Thus, if $\sigma$ is the δ-sequence $[\delta_1, \delta_2, \dots, \delta_{m-1}]$, and
$\delta_0 = \delta_1\delta_2\cdots\delta_{m-1}$, then, with $k$ as in §10, we find $k(\sigma\sigma\cdots\sigma) = [\delta_0\delta_1, \delta_0\delta_2,
\dots, \delta_0\delta_{m-1}]$, $k(\sigma\sigma\cdots k(\sigma\sigma\cdots\sigma)) = [\delta_1, \delta_2, \dots, \delta_{m-1}]$. Hence, the m-adic
order of a δ-sequence is one or two according as the product of its δ's is $+1$
or $-1$.

(47) Actually, by a slight change in point of view, the transformation of the Δ's resulting
from an m-adic substitution can be considered an m-adic linear transformation in one variable
in the sense of our later §35. The present and several other formulas, derived independently in
the present section, would then become special cases of the formulas of §35.

The cyclic subgroups of the complete m-adic δ-group are therefore of
orders one or two, there being $2^{m-2}$ first order subgroups, and, for $m>2$,
$2^{m-2}$ or $2^{m-3}$ cyclic second order subgroups according as $m$ is even or odd.
Our result on the δ-sequences leaving a given δ-sequence invariant, coupled
with the easily verified fact that an m-group of order two must be abelian,
leads to the result that the complete m-adic δ-group has exactly $2^{m-2}$ second
order subgroups for $m>2$. Hence, when $m$ is odd, half of them are non-cyclic(48).
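The counts of first order elements and of second order subgroups can be tabulated for small $m$. This is a brute-force sketch with illustrative names; it assumes that a closed class of two δ-sequences is a second order subgroup and that the cyclic subgroup generated by a second order $\sigma$ is the pair $\{\sigma, k(\sigma\sigma\cdots\sigma)\}$.

```python
from itertools import combinations, product

def k(sigmas):
    """m-adic operation on delta-sequences (0-based form of the equations of §10)."""
    r = len(sigmas[0])
    out = []
    for j in range(r):
        value = 1
        for i, sigma in enumerate(sigmas):
            value *= sigma[(j + i) % r]
        out.append(value)
    return tuple(out)

def census(m):
    """(first order elements, second order subgroups, cyclic second order subgroups)."""
    elements = list(product((1, -1), repeat=m - 1))
    order_one = [s for s in elements if k([s] * m) == s]
    pairs = [p for p in combinations(elements, 2)
             if all(k(c) in p for c in product(p, repeat=m))]
    cyclic = {frozenset((s, k([s] * m))) for s in elements if k([s] * m) != s}
    return len(order_one), len(pairs), len(cyclic)

for m in (3, 4, 5):
    print(m, census(m))
```

For odd $m$ the third count is half the second, the remaining second order subgroups consisting of two first order elements.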

We turn now to the determination of all the subgroups of the complete
m-adic δ-group, and consequently, the determination of all m-adic alternating
groups. Since the complete m-adic δ-group is semi-abelian, all of its
elements transform a given ε-sequence $(\epsilon_1, \epsilon_2, \dots, \epsilon_{m-1})$ into the same ε-sequence.
As before, we can employ the operation of the complete η-group, and
thus find the unique transform of $(\epsilon_1, \epsilon_2, \dots, \epsilon_{m-1})$ under every δ-sequence
to be $(\epsilon_{m-1}, \epsilon_1, \epsilon_2, \dots, \epsilon_{m-2})$. Now if $H$ is a subgroup of the complete m-adic
δ-group, its associated ordinary group $H_0$ must be a subgroup of the complete
ε-group invariant under $H$. Hence $H_0$ can only be such a subgroup
of the complete ε-group that if $(\epsilon_1, \epsilon_2, \dots, \epsilon_{m-2}, \epsilon_{m-1})$ is in the subgroup,
$(\epsilon_{m-1}, \epsilon_1, \epsilon_2, \dots, \epsilon_{m-2})$ also is in the subgroup. The determination of these
"admissible" subgroups of the complete ε-group is the only difficult part of
our problem. It was carried through independently by Gill; but he later found
that his solution followed essentially the lines of the general theory of the
"Verallgemeinerte Abelsche Gruppen," abbreviated V.A.G., as given by Otto
Haupt in the second volume of his Algebra(49).

(48) See §23 for the consequent structure of these second order subgroups.

(49) Otto Haupt, Einführung in die Algebra, Leipzig, 1929, vol. 2, pp. 617-621. The result
we need is the Theorem 3 of page 620.

Following Gill we replace the two values $+1$, $-1$ by 0, 1 respectively.
If an ε-sequence be thus rewritten, the dyadic operation of our complete
ε-group is best written in additive form, and we have

$$(\epsilon_{11}, \epsilon_{12}, \dots, \epsilon_{1(m-1)}) + (\epsilon_{21}, \epsilon_{22}, \dots, \epsilon_{2(m-1)}) = (\epsilon_{11}+\epsilon_{21}, \epsilon_{12}+\epsilon_{22}, \dots, \epsilon_{1(m-1)}+\epsilon_{2(m-1)}),$$

where addition within the parentheses is modulo 2. Now let $\phi(x)$ be any
polynomial in $x$ with coefficients 0 or 1. With $a$ any ε-sequence, a
unique ε-sequence $\phi(x)\cdot a$ is determined as follows. If $a$ is the ε-sequence
$(\epsilon_1, \epsilon_2, \dots, \epsilon_{m-2}, \epsilon_{m-1})$, let $x\cdot a$ be the ε-sequence $(\epsilon_{m-1}, \epsilon_1, \epsilon_2, \dots, \epsilon_{m-2})$.
With $1\cdot a = a$, and $x^n\cdot a$ defined inductively through $x^n\cdot a = x\cdot(x^{n-1}\cdot a)$, we can
define $\phi(x)\cdot a$ as the sum of the ε-sequences obtained by operating on $a$ by the
several terms of $\phi(x)$. We now observe two things. First, every ε-sequence can
be written $\phi(x)\cdot(1, 0, \dots, 0)$. In fact, to obtain $(\epsilon_1, \epsilon_2, \dots, \epsilon_{m-1})$, we need
merely let $\phi(x) = \epsilon_1 + \epsilon_2x + \cdots + \epsilon_{m-1}x^{m-2}$. Secondly, with $(0, 0, \dots, 0)$ abbreviated
0, we see that $(1, 0, \dots, 0)$ satisfies the equation $(x^{m-1}+1)\cdot(1, 0, \dots, 0) = 0$,
but fails to satisfy any equation $\phi(x)\cdot(1, 0, \dots, 0) = 0$
with $\phi(x)$ of degree less than $m-1$, and not identically zero. For we have
directly that $x^{m-1}\cdot(1, 0, \dots, 0) = (1, 0, \dots, 0)$; while with $\phi(x)$ of degree
less than $m-1$ our previous expression for $\phi(x)\cdot(1, 0, \dots, 0)$ applies. Note
finally that 0 and 1 constitute a field $K$ under addition modulo 2, and multiplication.
The entire theory of V.A.G.'s in general, and Theorem 3 of Haupt
in particular, can then be shown to be applicable, and yield the following
result.

The admissible subgroups of the complete m-adic ε-group are in 1-1 correspondence
with the polynomial divisors, other than unity, of $x^{m-1}+1$ relative to the
field of coefficients $K$. If $r(x)$ be such a divisor, and $a = r(x)\cdot(1, 0, \dots, 0)$, then
the corresponding subgroup consists of all distinct ε-sequences $\phi(x)\cdot a$.

Actually, if $\mu$ is the degree of $(x^{m-1}+1)/r(x)$, then $\phi(x)$ can be restricted
to degrees less than $\mu$, different $\phi(x)$'s then also giving different ε-sequences.
It follows that the order(50) of the corresponding subgroup is $2^{\mu}$. The subgroup
corresponding to $r(x)$ can also be described as consisting of all ε-sequences
$b$ such that $[(x^{m-1}+1)/r(x)]\cdot b = 0$. It follows that these subgroups
satisfy the same properties with respect to the relation of inclusion as do
the subgroups of an ordinary cyclic group, $(x^{m-1}+1)/r(x)$ taking the place of
the order of the subgroup. Note that the unique factorization theorem applies
to polynomials with coefficients in a given field. If then $x^{m-1}+1$ is thus completely
factored, the distinct divisors $r(x)$ can immediately be written down.
Since $x^{m-1}+1 = (x+1)(x^{m-2}+\cdots+x+1)$ relative to $K$, $x+1$ is always one
of the prime divisors of $x^{m-1}+1$. It can readily be shown that it is the only
distinct prime divisor of $x^{m-1}+1$, that is, that $x^{m-1}+1 = (x+1)^{m-1}$ relative
to $K$, when and only when $m-1$ is itself a power of 2. The different $r(x)$'s are
then $(x+1), (x+1)^2, \dots, (x+1)^{m-1}$, and each corresponding subgroup contains
the next.
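The correspondence with divisors of $x^{m-1}+1$ can be checked by machine for small $m$. The sketch below is an illustration under assumed encodings, none of which is in the original: an ε-sequence in 0, 1 form is a bitmask whose bit $j$ holds $\epsilon_{j+1}$, a polynomial over $K$ is a bitmask whose bit $i$ holds the coefficient of $x^i$, and `r` stands for $m-1$. Admissible subgroups are found independently by closing generator sets under addition and the shift $x\cdot a$. The divisor unity is included here; it yields the complete ε-group itself.

```python
from itertools import combinations

def shift(a, r):
    """x . a: (e_1, ..., e_r) -> (e_r, e_1, ..., e_{r-1})."""
    return ((a << 1) | (a >> (r - 1))) & ((1 << r) - 1)

def act(phi, a, r):
    """phi(x) . a: sum modulo 2 of the shifted copies of a picked out by phi."""
    out = 0
    while phi:
        if phi & 1:
            out ^= a
        a = shift(a, r)
        phi >>= 1
    return out

def poly_mod(a, b):
    """Remainder of a on division by b over the field of two elements."""
    db = b.bit_length()
    while a.bit_length() >= db:
        a ^= b << (a.bit_length() - db)
    return a

def divisors(r):
    """All divisors of x^r + 1, unity included."""
    p = (1 << r) | 1
    return [d for d in range(1, p + 1) if poly_mod(p, d) == 0]

def subgroup_of(d, r):
    """All phi(x) . a with a = d(x) . (1, 0, ..., 0)."""
    a = act(d, 1, r)
    return frozenset(act(phi, a, r) for phi in range(1 << r))

def closure(gens, r):
    """Smallest subgroup containing gens and closed under the shift."""
    group, frontier = {0}, list(gens)
    while frontier:
        g = frontier.pop()
        if g in group:
            continue
        new = {g ^ h for h in group}
        group |= new
        frontier.extend(shift(x, r) for x in new)
    return frozenset(group)

def admissible(r):
    """Every shift-invariant subgroup, from generator sets of at most r elements."""
    return {closure(c, r) for size in range(r + 1)
            for c in combinations(range(1, 1 << r), size)}

def check(r):
    divs = divisors(r)
    from_divisors = {subgroup_of(d, r) for d in divs}
    orders_ok = all(len(subgroup_of(d, r)) == 2 ** (r - (d.bit_length() - 1)) for d in divs)
    return from_divisors == admissible(r) and len(from_divisors) == len(divs) and orders_ok

print([check(r) for r in (2, 3, 4)])
```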

Having determined the admissible subgroups of the complete ε-group in
accordance with the above theorem, it is a simple matter to find the subgroups
of the complete δ-group. We return here to our original notation. Each
δ-subgroup $H$, if written in coset form, will be given by $H = H_0\sigma$, with $H_0$ an
admissible ε-subgroup, $\sigma$ a δ-sequence. Hence, if $H_0\sigma$ is known to be a δ-subgroup,
its elements can immediately be found from $H_0$ and $\sigma$ by the relation

$$(\epsilon_1, \epsilon_2, \dots, \epsilon_{m-1})[\delta_1, \delta_2, \dots, \delta_{m-1}] = [\epsilon_1\delta_1, \epsilon_2\delta_2, \dots, \epsilon_{m-1}\delta_{m-1}],$$

a mere specialization of the dyadic operation of the complete η-group.

(50) In the ordinary sense, not that of V.A.G.'s.

Since every admissible ε-subgroup $H_0$ is invariant under every δ-sequence
$\sigma$, it follows from an early theorem of §4 that $H_0\sigma$ will be a δ-subgroup for
every first order $\sigma$, and for those second order $\sigma$'s for which $\sigma^{m-1}$ is in $H_0$.
The distinct δ-subgroups thus arising will then be all the δ-subgroups for a
given $H_0$. Now when $m$ is even, every δ-subgroup must have at least one first
order element. Hence in this case, the δ-subgroups corresponding to $H_0$ will
be all the distinct $H_0\sigma$'s with $\sigma$ a first order element. Now it is readily proved
that if $\tau = (\epsilon_1, \epsilon_2, \dots, \epsilon_{m-1})$, the order of $\tau\sigma$ is the same as that of $\sigma$, or opposite,
according as $\epsilon_0 = \epsilon_1\epsilon_2\cdots\epsilon_{m-1}$ is $+1$ or $-1$, and furthermore, that the
elements of $H_0$ either all have $\epsilon_0$ equal to $+1$, or exactly half have $\epsilon_0 = +1$,
half $-1$. Hence, if $H_0$ is of order $2^{\mu}$, $H_0\sigma$, with $\sigma$ of first order, has $2^{\mu}$ or $2^{\mu-1}$
first order elements according as the elements of $H_0$ have or have not $\epsilon_0$'s
all $+1$. Since the distinct $H_0\sigma$'s with given $H_0$ are mutually exclusive, while
each of the $2^{m-2}$ first order elements of the complete m-adic δ-group is in some
$H_0\sigma$, it follows that when $m$ is even, for each admissible ε-subgroup $H_0$ of
order $2^{\mu}$ there are exactly $2^{m-\mu-2}$ or $2^{m-\mu-1}$ corresponding δ-subgroups according
as the $\epsilon_0$'s of the elements of $H_0$ are, or are not, all $+1$.

For $m$ odd, and given $H_0$, we also have these subgroups. But now there may
be additional subgroups $H_0\sigma$ with all elements of order two. Now if $\sigma$ is of second
order, the ε-sequence of $\sigma^{m-1}$ is readily seen to be $(-1, -1, \dots, -1)$.
It follows that these additional subgroups can arise only when $H_0$ has the
element $(-1, -1, \dots, -1)$, while the $\epsilon_0$'s of all its elements are $+1$. But
then each of the $2^{m-2}$ second order δ-sequences will be in one of these additional
subgroups. For such an $H_0$, therefore, in addition to the now $2^{m-\mu-2}$
δ-subgroups consisting wholly of first order elements, there will be $2^{m-\mu-2}$ additional
δ-subgroups each, indeed, consisting wholly of second order elements.
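The counting rules for even and odd $m$ can be compared with a direct enumeration for $m=3$ and $m=4$. The sketch below is illustrative only. It assumes that a finite class of δ-sequences closed under $k$ is a δ-subgroup, and it finds the admissible ε-subgroups by brute force in $\pm 1$ form; `predicted` is a transcription of the rules just stated, and all names are hypothetical.

```python
from collections import Counter
from itertools import combinations, product
from math import prod

def k(sigmas):
    """m-adic operation on delta-sequences (0-based form of the equations of §10)."""
    r = len(sigmas[0])
    return tuple(prod(s[(j + i) % r] for i, s in enumerate(sigmas)) for j in range(r))

def subgroup_sizes(m):
    """Number of closed classes of delta-sequences, tallied by order."""
    elements = list(product((1, -1), repeat=m - 1))
    sizes = Counter()
    for size in range(1, len(elements) + 1):
        for subset in combinations(elements, size):
            members = set(subset)
            if all(k(c) in members for c in product(subset, repeat=m)):
                sizes[size] += 1
    return dict(sizes)

def admissible(m):
    """Subgroups of the epsilon-group closed under (e_1..e_r) -> (e_r, e_1..e_{r-1})."""
    r = m - 1
    ident = (1,) * r
    others = [e for e in product((1, -1), repeat=r) if e != ident]
    result = []
    for size in range(len(others) + 1):
        for extra in combinations(others, size):
            h = {ident, *extra}
            closed = all(tuple(x * y for x, y in zip(a, b)) in h for a in h for b in h)
            if closed and all(a[-1:] + a[:-1] in h for a in h):
                result.append(h)
    return result

def predicted(m):
    """Counts of delta-subgroups by order, from the rules stated in the text."""
    counts = Counter()
    for h in admissible(m):
        mu = len(h).bit_length() - 1                  # order of H_0 is 2^mu
        all_plus = all(prod(a) == 1 for a in h)
        first = 2 ** (m - mu - 2) if all_plus else 2 ** (m - mu - 1)
        extra = first if (m % 2 == 1 and all_plus and (-1,) * (m - 1) in h) else 0
        counts[len(h)] += first + extra
    return dict(counts)

for m in (3, 4):
    print(m, subgroup_sizes(m), predicted(m))
```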

Actually, the number of δ-subgroups with given associated ordinary group
$H_0$ can be determined without explicitly writing out the elements of $H_0$, but
merely by an inspection of the corresponding $r(x)$. Thus, we have already
seen that the order of $H_0$ is $2^{\mu}$, where $\mu$ is the degree of $(x^{m-1}+1)/r(x)$. By
means of the second description given for the subgroup $H_0$ it can further be
shown that $(-1, -1, \dots, -1)$ is in $H_0$ when and only when $(x^{m-1}+1)/r(x)$
has $x+1$ for divisor; while from the first description it can be shown that
the $\epsilon_0$'s of $H_0$ are all $+1$ when and only when $r(x)$ has $x+1$ for divisor. This
covers all we need to know about $H_0$.

In particular, for $m>3$, we always have the three distinct divisors of
$x^{m-1}+1$ equal to $x^{m-1}+1$, $x^{m-2}+\cdots+x+1$, $x+1$. In the first case $H_0$ is of
order one, and consists of but $(+1, +1, \dots, +1)$, the identity. The corresponding
δ-subgroups are the first order δ-subgroups listed above. In the
second case $H_0$ is of order two, and consists of $(+1, +1, \dots, +1)$ and
$(-1, -1, \dots, -1)$. It is obviously the only admissible second order ε-subgroup,
and hence the corresponding δ-subgroups are all of the second order
δ-subgroups as first listed. The third subgroup, of order $2^{m-2}$, is again the only
admissible ε-subgroup of that order, and consists of all ε-sequences with $\epsilon_0$
equal to $+1$. Our general solution then shows that as a result there is but one
δ-subgroup of order $2^{m-2}$ for $m$ even, two for $m$ odd.

Actually, the equations of §10 for the m-adic operation on δ-sequences directly
show that we always have the subgroup of order $2^{m-2}$ consisting of all
δ-sequences with $\delta_0 = +1$, and for $m$ odd also the subgroup of order $2^{m-2}$ consisting
of all δ-sequences with $\delta_0 = -1$. Since the complete m-adic δ-group is
semi-abelian, all of its subgroups are semi-invariant. It is then of interest to
note that the above one, or two, subgroups of order $2^{m-2}$ are its only invariant
subgroups. In fact, our formula for the transform of one δ-sequence by another
shows that $\delta_0$ is always thus left invariant; and it also shows that a
δ-sequence can always be found which transforms a given δ-sequence into any
other with the same $\delta_0$.

These results are immediately applicable to the corresponding alternating
groups, assuming $n>1$. There are thus always $2^{m-2}$ alternating groups with
substitutions forming one of the $2^{m-1}$ classes of §10 and, for $m>2$, $2^{m-2}$ alternating
groups with substitutions forming two such classes. Passing by the general
solution, we note that the conditions $\delta_0 = +1$, and $\delta_0 = -1$, correspond to
an m-adic substitution considered as an ordinary substitution being positive,
or negative. Hence, the only alternating groups invariant under the symmetric
group are the alternating group of all positive substitutions, and, for
$m$ odd, also the alternating group of all negative substitutions. On the other
hand, every alternating group is a semi-invariant subgroup of the symmetric
group.

The last observation restricts the possible simplicity of m-adic alternating
groups. Regarding the nonexistence of a quotient group of lower order than
itself as the distinguishing mark of an ordinary simple group, we are led
to define a simple m-group as one whose associated group has no subgroup
other than the identity invariant under the m-group. It follows that for n>2
only alternating groups corresponding to first order δ-subgroups can be simple.
For in any other case, (+1, +1, ..., +1), the identity of the associated
ε-subgroup, is a subgroup thereof invariant under the δ-subgroup. Hence the
elements of the associated group of the alternating group with ε-sequence
(+1, +1, ..., +1) then constitute a subgroup of the associated group invariant
under the alternating group. We now proceed to show, on the strength
of the corresponding result for ordinary groups, that when n>4 every alternating
group H corresponding to a first order δ-subgroup is simple. Since the
associated ε-subgroup has but the sole ε-sequence (+1, +1, ..., +1), the
associated ordinary group H_0 of the alternating group H consists of all
elements t = t' t'' ... t^{(m-1)} where t^{(i)} is any positive substitution on
the letters of Γ_i, and is thus the direct product of the ordinary alternating
groups A_1, A_2, ..., A_{m-1} on the letters of Γ_1, Γ_2, ..., Γ_{m-1}
respectively. Let then K_0 be any subgroup of H_0 invariant under H. If there
could be more than one t in K_0 with the same components t', t'', ..., t^{(m-2)},
then there would be more than one t in K_0 with each component
t', t'', ..., t^{(m-2)} the identity. Now these t's must constitute a subgroup
of K_0, and this subgroup will be invariant under H_0 as a consequence of the
invariance of K_0 under H_0. The corresponding t^{(m-1)}'s must then constitute
an invariant subgroup of A_{m-1}, if not A_{m-1} itself. Under the present
supposition the last would be true; for with n>4, the alternating group A_{m-1}
is simple. But then K_0 would coincide with H_0, instead of being a subgroup of
H_0. For, K_0 being invariant under any s in H, if we transform the above
elements of K_0 by s, s^2, ..., s^{m-2}, we would have in K_0 every element of
H_0 any m-2 of whose components are the identity; and the products of these
elements constitute H_0. We have therefore proved that an element
t = t' t'' ... t^{(m-1)} of K_0 is uniquely determined by its first m-2
components. If then we transform t by any element of H_0 the first m-2 of whose
components are the identity, the first m-2 components of t, and hence t itself,
will be unchanged. t^{(m-1)} is then always an invariant element of A_{m-1},
and, again with n>4, can only be the identity. The same argument would
show each component of an element of K_0 to be the identity, so that K_0 is the
identity.

We have therefore proved that for n>4 the 2^{m-2} m-adic alternating groups
of degree n corresponding to first order δ-subgroups are simple, the others not.
For n=4 no m-adic alternating group is simple, since we can let K_0 be the
direct product of the axial groups on the letters of the several Γ's. Again, for
n=3, no m-adic alternating group is simple for any m>2. The preceding argument
breaks down at the one point where the invariance of t^{(m-1)} under A_{m-1}
is used to prove t^{(m-1)} the identity. K_0 may now be the third order group
obtained from the simple isomorphism between A_1, A_2, ..., A_{m-1} that
results when A_i is transformed into A_{i+1} by a fixed element s of H. Finally,
when n=2, the very first step of our argument breaks down. The m-adic
alternating groups can now be identified with the δ-subgroups themselves.
The simple δ-subgroups are those whose associated ε-subgroups have no
admissible ε-subgroup for subgroup other than the identity. Hence, in terms of
the above general determination of admissible ε-subgroups, the simple
δ-subgroups are those whose associated ε-subgroups have (x^{m-1}+1)/7(x) prime.

13. Transitive m-adic substitution groups. Since an m-adic substitution
group G can carry the letters of Γ_i only into those of Γ_{i+1}, we are led to
define a transitive m-adic group G as one whose substitutions will carry each
letter of each Γ into every letter of the succeeding Γ. Clearly, the m-adic
symmetric group of arbitrary degree n, and the m-adic alternating groups of
degree n>2 are then transitive. Our analysis in the next section shows that G
will be transitive if the above condition is true for any one Γ, and indeed for
any one letter of a Γ, i.e., if the substitutions of G carry one letter of one Γ
into every letter of the succeeding Γ, the same is true of every letter of
every Γ, and G is transitive.

It is readily proved that the containing 2-group G* of a transitive m-group
G is transitive. In fact, let a_{ij} and a_{(i+k)l} be any two letters of the
Γ's. Considering i+k reduced modulo m-1, we may assume 1≤k≤m-1. We need
not consider k=1. For k>1 let r be any (k-1)-ad. It will carry a_{ij} into some
letter a_{(i+k-1)p}. Some s of G will carry a_{(i+k-1)p} into a_{(i+k)l}. Hence
the k-ad rs, which is a substitution in G*, carries a_{ij} into a_{(i+k)l}, as
required. Conversely, if G* is transitive, a substitution in G* carrying a_{ij}
into a_{(i+1)l} belongs to G, and hence G is transitive.

In terms of G_0, the associated 2-group of G, we likewise see that G is
transitive when and only when the substitutions of G_0 carry each letter of
each Γ into every letter of the same Γ. Recalling our definition of the
associated constituent groups G_0', G_0'', ..., G_0^{(m-1)} of G, we thus have
that G is transitive when and only when its associated constituent groups are
transitive. As the latter are conjugate, it follows that G is transitive if any
one of its associated constituent groups is known to be transitive.

Let (G_0)_{ij} be the subgroup of G_0 which consists of all substitutions in
G_0 that carry a_{ij} into itself. If we expand G in right cosets as regards
(G_0)_{ij}, the members of each single coset carry a_{ij} into one and the same
letter of Γ_{i+1}. Also, if s_1 and s_2 of G carry a_{ij} into the same letter,
s_1 s_2^{-1} will be in (G_0)_{ij}, so that s_1 and s_2 are in the same coset.
Each coset therefore consists of all the substitutions of G carrying a_{ij} into
the corresponding letter. If then G is transitive of degree n, there will be
exactly n such cosets, one for each letter of Γ_{i+1}. Hence, the order of
(G_0)_{ij} is equal to g/n if G is a transitive group of order g, and degree n.
The order of a transitive m-adic substitution group is therefore a multiple of
its degree. Furthermore, the number of substitutions of a transitive m-adic
substitution group that carry any letter a_{ij} into any letter a_{(i+1)k} is,
for all such pairs of letters, equal to the order of the group divided by its
degree.
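
The counting statement can be checked on a small case. The sketch below is an
illustration only, not part of the original argument. It assumes m = 3 and
degree n = 2, takes the triadic symmetric group on Γ_1 = {a11, a12},
Γ_2 = {a21, a22}, and encodes letters as strings and substitutions as Python
dicts; the helper names are hypothetical. For each pair of letters in
consecutive Γ's it counts the substitutions carrying the first into the second.

```python
from itertools import permutations

# Letters of the m - 1 = 2 classes (m = 3); the degree is n = 2.
GAMMA = [["a11", "a12"], ["a21", "a22"]]

def triadic_symmetric_group(gamma):
    """All substitutions carrying each class onto the cyclically next class."""
    group = []
    k = len(gamma)

    def build(i, partial):
        if i == k:
            group.append(dict(partial))
            return
        src, dst = gamma[i], gamma[(i + 1) % k]
        for image in permutations(dst):
            build(i + 1, {**partial, **dict(zip(src, image))})

    build(0, {})
    return group

def carry_counts(group, gamma):
    """Number of substitutions carrying each letter into each letter of the next class."""
    counts = {}
    k = len(gamma)
    for i in range(k):
        for x in gamma[i]:
            for y in gamma[(i + 1) % k]:
                counts[(x, y)] = sum(1 for s in group if s[x] == y)
    return counts

G = triadic_symmetric_group(GAMMA)
g, n = len(G), len(GAMMA[0])
counts = carry_counts(G, GAMMA)
print(g, n, set(counts.values()))
```

Here g = 4 and n = 2, and every pair of letters in consecutive Γ's is served by
exactly g/n = 2 substitutions, in agreement with the statement above.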

Since for m>2 an m-adic substitution group cannot carry a letter into
itself, we have to turn to the associated group of a transitive m-adic
substitution group for an average number of letters theorem. For this purpose we
write the substitutions of the associated group in standard cycle form. Observe
first that each associated constituent group G_0^{(i)} being transitive, and
of degree n, the average number of its letters appearing in its substitutions
is n-1. Fixing our attention on G_0^{(i)}, we consider the subgroup H^{(i)} of
G_0 consisting of all the substitutions of G_0 whose component in G_0^{(i)} is
the identity of G_0^{(i)}. If we expand G_0 in cosets as regards H^{(i)}, each
coset is easily seen to consist of all the substitutions of G_0 which have a
fixed component in G_0^{(i)}. Each substitution of G_0^{(i)} therefore occurs
the same number of times in G_0. It follows that the average number of letters
of each Γ_i occurring in the substitutions of the associated group of a
transitive group of degree n is n-1. This is our strongest result. From it, or
from our discussion of (G_0)_{ij}, we also have that the average number of all
letters appearing in the substitutions of the associated group of a transitive
m-adic substitution group of degree n is (m-1)(n-1). This may also be seen as
follows. Since the containing group G* of the transitive m-adic group G is
transitive, and of degree (m-1)n, the average number of letters in its
substitutions is (m-1)n-1. The total number of letters in its substitutions is
then (m-1)g[(m-1)n-1], g being the order of G. Of the (m-1)g substitutions in
G*, the (m-2)g substitutions not in G_0 each has its full complement of (m-1)n
letters. The total number of letters in the substitutions of G_0 is thus the
remaining (m-1)(n-1)g letters, whence the result.
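
The subtraction in the last step can be written out explicitly. The display
below only restates the counts given in the paragraph above, with g the order
and n the degree of G.

```latex
\begin{align*}
\underbrace{(m-1)\,g\,[(m-1)n-1]}_{\text{letters in all substitutions of } G^{*}}
 \;-\; \underbrace{(m-2)\,g\,(m-1)\,n}_{\text{letters in substitutions not in } G_{0}}
 &= (m-1)\,g\,[(m-1)n - 1 - (m-2)n] \\
 &= (m-1)(n-1)\,g .
\end{align*}
```

Dividing by the g substitutions of G_0 gives the stated average (m-1)(n-1).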

14. Intransitive m-adic substitution groups. Let G be any m-adic substitution
group on the letters of Γ_1, Γ_2, ..., Γ_{m-1}, and let Γ_1', Γ_2', ...,
Γ_{m-1}' be the subclasses of the letters of Γ_1, Γ_2, ..., Γ_{m-1} respectively
into which a_{(m-1)1} is carried by the elements, dyads, ..., (m-1)-ads of G.
If s is any substitution in G, then as r ranges through the i-ads of G, rs
ranges through the (i+1)-ads of G. Hence s transforms the letters of Γ_i' in
1-1 fashion into the letters of Γ_{i+1}' for each i, and thus determines an
m-adic substitution on Γ_1', Γ_2', ..., Γ_{m-1}'. Furthermore, if a_{ij} and
a_{(i+1)k} are any two letters of Γ_i' and Γ_{i+1}' respectively, some s of G
will carry a_{ij} into a_{(i+1)k}. For some i-ad r_1 of G carries a_{(m-1)1}
into a_{ij}, and some (i+1)-ad r_2 of G carries a_{(m-1)1} into a_{(i+1)k}.
Hence the element r_1^{-1} r_2 of G carries a_{ij} into a_{(i+1)k}. The m-adic
substitutions on Γ_1', Γ_2', ..., Γ_{m-1}' obtained from all the substitutions
of G therefore constitute a transitive m-adic substitution group on
Γ_1', Γ_2', ..., Γ_{m-1}'. If G is transitive, this group is identical with G.
If G is not transitive, we may call this group a transitive constituent group
of the intransitive group G. In that case, by accounting for all the letters of
Γ_{m-1}, we obtain a number of transitive constituent groups of G such that
every substitution in G is the product of a selection of substitutions from the
transitive constituent groups of G.

This result can also be obtained by analysing the containing group G* of 
G, whence it also appears that the transitive constituent groups of G* are the 
containing groups of the transitive constituent groups of G. 

The direct product and simple isomorphism methods for obtaining intransitive
ordinary groups admit of immediate extension to m-adic groups. In
the latter case, let G_1 and G_2 be the same m-adic substitution group written
on different letters. If the letters of G_1 form the sets Γ_1', Γ_2', ...,
Γ_{m-1}', those of G_2 the sets Γ_1'', Γ_2'', ..., Γ_{m-1}'', the products of
corresponding substitutions in G_1 and G_2 will be m-adic substitutions on
Γ_1, Γ_2, ..., Γ_{m-1}, where Γ_i consists of all the letters of Γ_i' and
Γ_i''. Clearly, an m-adic substitution group is thus formed simply isomorphic
with G_1 and G_2, but of twice their degree. Similarly for any number of groups
obtained by writing a given m-adic substitution group on different letters.

As for the direct product method, let H_1 and H_2 be m-adic substitution
groups on Γ_1', Γ_2', ..., Γ_{m-1}', and Γ_1'', Γ_2'', ..., Γ_{m-1}'', with all
letters distinct. As before, form Γ_1, Γ_2, ..., Γ_{m-1}. Then, if s_i' and
s_j'' be any two substitutions in H_1 and H_2 respectively, s_i' s_j'' will be
an m-adic substitution on Γ_1, Γ_2, ..., Γ_{m-1}. The set of all such products
clearly constitutes an m-adic substitution group G of order equal to the
product of the orders of H_1 and H_2, and degree equal to the sum of their
degrees. When m=2, the existence of an identical element, coupled with the
ambiguity of the cycle notation, allows us to consider H_1 and H_2 subgroups of
G which can then be said to be generated by H_1 and H_2. When m>2 this is no
longer possible. We shall therefore refrain from calling G the direct product
of H_1 and H_2, reserving that phrase for a more special concept found useful
in the sequel(5*).

(5*) In fact, while G is an m-adic substitution group, the m-group "generated"
by H_1 and H_2 is, for m>2, a hybrid sort of an affair of order m-1 times the
order of G. On the other hand, if either H_1 or H_2 has a first order element,
G will contain a subgroup H_2' or H_1' simply isomorphic to H_2 or H_1
respectively; and if both H_1 and H_2 possess an invariant first order element,
the corresponding H_1' and H_2' will generate G, and G will then be the direct
product of H_1' and H_2' in the sense later defined (§25).

15. Substitutions which are commutative with each of the substitutions
of a transitive m-adic substitution group. Recalling that the order of a
transitive m-adic substitution group is a multiple of its degree, we may most
briefly define a regular m-adic substitution group as a transitive m-adic
substitution group whose order is equal to its degree. In view of the
corresponding general result for transitive groups, this is equivalent to
defining a regular m-adic substitution group as an m-adic substitution group
which, for any pair of letters in consecutive Γ's, has one and only one
substitution carrying the first letter into the second. Other transitive group
results, coupled with the order criterion of regularity, show that an m-adic
substitution group is regular if and only if its containing group is regular;
also, if and only if its associated constituent groups are regular. The orders
of the associated group, and the associated constituent groups, then being the
same, it also follows that a regular m-adic substitution group is a transitive
group whose associated group has no substitutions other than the identity
omitting a letter. Regular m-adic substitution groups play the same role in
polyadic as in ordinary group theory, since we later show that every finite
abstract m-adic group can be represented as a regular m-adic substitution group.

According to a theorem of Jordan, the substitutions on the letters of a
regular group commutative with each of its substitutions constitute a group
conjugate to the regular group and known as its conjoint. We extend this
theorem to a regular m-adic substitution group G by directly applying it to
the containing group G*, which is known to be regular. In fact, since G* is
generated by G, the ordinary substitutions on the letters of G commutative
with each of its substitutions are the same as those commutative with each
of the substitutions of G*. Hence, to find the m-adic substitutions on the
letters of G commutative with each of its substitutions we need merely pick out
those substitutions in the conjoint of G* which are m-adic substitutions.

To do this we must re-examine the standard proof of Jordan’s theorem.
In this proof the letters on which the given regular group is written are
replaced by the symbols s_i used for the substitutions in the group. Then, in
the simple isomorphism established between the group and its conjoint, the
jth substitution of the given group in its new form replaces each symbol s_i
by the symbol for the substitution s_i s_j, while the corresponding substitution
in the conjoint replaces each s_i by s_j s_i. Finally, it is shown that the
given group is transformed into its simply isomorphic conjoint by the
substitution which carries the second letter of each substitution of the given
group into the second letter of the corresponding substitution of the conjoint
when all the substitutions(®) are written in cycle form with the same first
letter.

(®) All except the identity, that is. Likewise later in the proof. 
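
For an ordinary regular group the construction in this proof can be carried
out mechanically. The following sketch is an illustration under assumed
conventions: products are read left to right, the group is the symmetric group
on three letters with its elements encoded as tuples, and the helper names are
hypothetical. It relabels the letters by the group elements s_i, forms the
substitutions s_i -> s_i s_j and s_i -> s_j s_i, and confirms by exhaustive
search that the second family is exactly the set of substitutions on the six
symbols commuting with every member of the first.

```python
from itertools import permutations

def compose(p, q):
    """Product pq read left to right: apply p first, then q."""
    return tuple(q[p[x]] for x in range(len(p)))

# Elements s_0, ..., s_5 of the symmetric group on three letters.
ELEMENTS = list(permutations(range(3)))
INDEX = {s: i for i, s in enumerate(ELEMENTS)}
N = len(ELEMENTS)

# Regular group: the j-th substitution replaces the symbol s_i by s_i s_j.
regular = {tuple(INDEX[compose(s_i, s_j)] for s_i in ELEMENTS) for s_j in ELEMENTS}
# Conjoint: the corresponding substitution replaces s_i by s_j s_i.
conjoint = {tuple(INDEX[compose(s_j, s_i)] for s_i in ELEMENTS) for s_j in ELEMENTS}

# Exhaustive search over all 720 substitutions on the six symbols.
centralizer = {
    c for c in permutations(range(N))
    if all(compose(c, r) == compose(r, c) for r in regular)
}

print(len(regular), len(conjoint), centralizer == conjoint)
```

The brute-force search is feasible only because the degree is six; it is meant
as a check of the statement, not as a method for larger groups.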




For our purpose it suffices to determine the nature of this substitution
in the case of the regular group G*. With G a regular m-adic substitution
group on the letters of Γ_1, Γ_2, ..., Γ_{m-1}, the substitutions of G* can be
grouped into corresponding classes Γ_1', Γ_2', ..., Γ_{m-1}' according as they
are elements, dyads, ..., (m-1)-ads of G. When G* is rewritten in accordance
with the proof of Jordan’s theorem, Γ_1', Γ_2', ..., Γ_{m-1}' take the place of
Γ_1, Γ_2, ..., Γ_{m-1}. In fact, if s_i is a k-ad in G*, s_j an l-ad, s_i s_j
is a (k+l)-ad. Hence, in the above description applied to G*, if s_j is an
l-ad, the jth substitution of G* transforms each Γ_k' into Γ_{k+l}', and hence
is an l-ad on Γ_1', Γ_2', ..., Γ_{m-1}'. But the same reasoning shows the
corresponding substitution in the conjoint of G* also to be an l-ad on
Γ_1', Γ_2', ..., Γ_{m-1}'. Or, returning to Γ_1, Γ_2, ..., Γ_{m-1}, we have
that in the simple isomorphism between G* and its conjoint the correspondent of
an i-ad in G* is an i-ad. If then we write the substitutions of G* and its
conjoint with the same first letter, say a_{11}, if the corresponding
substitutions in G* and its conjoint are both i-ads, their second letters will
both be in Γ_{1+i}. Hence, the substitution which transforms G* into its
conjoint transforms each Γ into itself, and consequently is an (m-1)-ad on the
letters of Γ_1, Γ_2, ..., Γ_{m-1}.

Our result now immediately follows. The (m-1)-ad will transform only
the m-adic substitutions of G* into m-adic substitutions. Hence the m-adic
substitutions in the conjoint of G* are the transforms of the m-adic
substitutions in G* by the (m-1)-ad, i.e., the transforms of the substitutions
in G. Since the transform of an m-adic group is a simply isomorphic m-adic
group, we thus have the following extension of Jordan’s theorem. The m-adic
substitutions on the sequence of Γ's of a regular m-adic substitution group
commutative with all the substitutions of the group constitute a regular m-adic
substitution group of the same order, and this group is the transform of the
given group by an (m-1)-ad of m-adic substitutions. Clearly, the relationship
between the two groups is a reciprocal one, and we may call each the conjoint
of the other. Either directly, or as a consequence of a later general result on
transforms, it may be verified that each group can be transformed into the
other by an m-adic substitution, and hence that the two are conjugate according
to any m-adic definition.

We further have, as a result of the above discussion, that all ordinary
substitutions on the letters of a regular m-adic substitution group of degree n
commutative with all the substitutions of the group are polyads of m-adic
substitutions, there being n such i-ads for every i. Together they of course
constitute the conjoint of the containing group of the given group; and this
conjoint is now seen to be the containing group of the conjoint of the given
group.

In passing from regular m-adic substitution groups to arbitrary transitive
m-adic substitution groups for the purpose of extending Kuhn’s theorem to
m-adic groups, we shall adopt the viewpoint of the last paragraph, and seek
all substitutions on the letters of the transitive m-adic group commutative
with each of its substitutions; for now m-adic substitutions of this kind will
exist only if the given group satisfies a special condition. If G is a transitive
m-adic substitution group, G* is transitive. Again, the substitutions on the
letters of G commutative with every substitution in G are the substitutions on
the letters of G* commutative with each of its substitutions, and hence can
be found by applying Kuhn’s theorem to G*. As before, we assume all
substitutions written in cycle form.

According to Kuhn’s generalization of Jordan’s theorem the number of
substitutions on the letters of G* commutative with each of its substitutions
is the same as the number of letters omitted in all substitutions of G* which
omit a given letter. Actually, such substitutions will be in G_0, the associated
group of G. Let {a_{ij}} designate the set of letters omitted by all
substitutions of G* that omit a_{ij}. Since G* is transitive, it follows that if
a_{i_1 j_1} is in {a_{ij}}, then {a_{i_1 j_1}} = {a_{ij}}, and a substitution r
of G*, carrying a_{ij} into a_{i_1 j_1}, carries the set of letters {a_{ij}}
into itself. But r carries all the letters of Γ_i into all those of Γ_{i_1},
and hence all the letters of {a_{ij}} that are in Γ_i into all the letters of
{a_{ij}} that are in Γ_{i_1}. Hence, if there are α letters of {a_{ij}} in one
Γ, there are α letters of {a_{ij}} in every Γ that has at least one of them.
Now with the Γ's arranged in a cycle, let δ be the least difference between the
subscripts of consecutive Γ's that have letters of {a_{ij}}. Then some δ-ad r
in G* will transform the set {a_{ij}} into itself. As r will then carry the
letters of {a_{ij}} which are in any Γ_i into letters of {a_{ij}} which are in
Γ_{i+δ}, it follows that the Γ's having letters in {a_{ij}} have subscripts
which are in arithmetic progression, with the common difference δ, indeed, a
divisor of m-1. Finally, the known properties of transitive groups show the
different sets {a_{ij}} to be mutually exclusive, and transformable into each
other by the substitutions of G*. It follows that the numbers α and δ are the
same for all such sets; and since together they exhaust the letters of G*, that
α is a divisor of n. Hence the following result. If G is a transitive m-adic
substitution group of degree n, then the number of letters omitted by all
substitutions of the containing group G* that omit a given letter is of the
form κα, where κ is a divisor of m-1, α a divisor of n; furthermore, there are
α of these letters in every Γ that has at least one, the subscripts of these
Γ's forming an arithmetic progression.
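
The count quoted from Kuhn’s theorem in the paragraph above can be checked on
a small ordinary (m = 2) transitive group. The sketch below is illustrative
only: the group of symmetries of a square acting on its four vertices is a
choice made here and is not taken from the text, and the helper names are
hypothetical. It compares the number of substitutions commuting with the whole
group against the number of letters omitted by all substitutions that omit the
letter 0.

```python
from itertools import permutations

def compose(p, q):
    """Product pq read left to right: apply p first, then q."""
    return tuple(q[p[x]] for x in range(len(p)))

def generate(generators):
    """Closure of a set of permutations under multiplication."""
    n = len(generators[0])
    group = {tuple(range(n))}
    frontier = list(group)
    while frontier:
        p = frontier.pop()
        for gen in generators:
            q = compose(p, gen)
            if q not in group:
                group.add(q)
                frontier.append(q)
    return group

# Symmetries of a square on vertices 0, 1, 2, 3: a rotation and a reflection.
rotation = (1, 2, 3, 0)
reflection = (0, 3, 2, 1)
G = generate([rotation, reflection])

# Substitutions of G omitting (fixing) the letter 0, and the letters they all omit.
stabilizer = [p for p in G if p[0] == 0]
omitted = [x for x in range(4) if all(p[x] == x for p in stabilizer)]

# All substitutions on the four letters commuting with every member of G.
commuting = [
    c for c in permutations(range(4))
    if all(compose(c, p) == compose(p, c) for p in G)
]

print(len(G), omitted, len(commuting))
```

In this instance the two numbers agree: two letters (0 and the opposite vertex
2) are omitted, and two substitutions commute with the whole group.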

According to the proof of Kuhn’s theorem the resulting κα substitutions
on the letters of G* commutative with each of its substitutions are obtained
as follows. Let H_{11} be the subgroup of G* composed of the substitutions of
G* which leave the set of letters {a_{11}} unchanged, and let C_{11} be the
conjoint of the regular group K_{11}, on the letters {a_{11}}, formed by the
components on those letters of the substitutions in H_{11}. For each
substitution in C_{11}, form the product of all the distinct transforms of that
substitution under G*. These products are the κα substitutions on the letters
of G* commutative with each of its substitutions. According to our distribution
result for the set of letters {a_{11}}, there are α of these letters in each of
the κ Γ's, Γ_1, Γ_{1+δ}, ..., Γ_{1+(κ-1)δ}, where κδ = m-1. Clearly, the
substitutions of H_{11} can only be i-ads with i = δ, 2δ, ..., κδ. Since K_{11}
is regular on the letters {a_{11}}, it will have, for each of the above i's, α
substitutions which are components of i-ads in H_{11}. Our proof of the
extended Jordan theorem applies sufficiently to the relationship between K_{11}
and its conjoint C_{11} to show that C_{11} also consists of α "components of
i-ads" for each i = δ, 2δ, ..., κδ. Now when a substitution of C_{11} is
transformed by the substitutions of G*, the set of letters {a_{11}} will go
over into all the mutually exclusive distinct sets {a_{ij}}, there being one
and only one distinct transform of the substitution of C_{11} for each set
{a_{ij}}. If the substitution in question is a component of an i-ad, each
transform will also be a component of an i-ad. As the sets {a_{ij}} are
mutually exclusive, and exhaust the letters of Γ_1, Γ_2, ..., Γ_{m-1}, the
product of these transforms will exactly constitute an i-ad. We thus have the
following extension of Kuhn’s theorem. The only substitutions on the letters of
the Γ's of a transitive m-adic substitution group commutative with each of its
substitutions are polyads of m-adic substitutions on the same sequence of Γ's;
in the notation of the distribution theorem, if δ = (m-1)/κ, these polyads can
only be i-ads with i = δ, 2δ, ..., κδ, there being exactly α such i-ads for
each admissible i.

In particular, if we restrict our attention to m-adic substitutions, we have
the following result. The necessary and sufficient condition that there be at
least one m-adic substitution on the sequence of Γ's of a transitive m-adic
substitution group commutative with each of its substitutions is that the
subgroup of the associated group consisting of all its substitutions omitting a
given letter in one Γ omits a fixed letter in the following Γ; if then that
subgroup omits exactly α letters from one Γ, it will omit α letters from every
Γ, and there will be exactly α such m-adic substitutions.

16. Holomorphs of a regular m-adic substitution group. The concept of
holomorph of a regular group admits both of an immediate extension to regular
m-adic substitution groups, as well as of a further generalization peculiar
to polyadic theory. For the immediate extension, let G be a regular m-adic
substitution group of order n on the letters of Γ_1, Γ_2, ..., Γ_{m-1}. Then
all the m-adic substitutions on Γ_1, Γ_2, ..., Γ_{m-1} which transform G into
itself constitute an m-adic substitution group of degree n which we shall call
the principal holomorph of G. Clearly, the principal holomorph of G not only
contains G, but also the conjoint of G. Since the transforms of commutative
substitutions are commutative, it follows that the principal holomorph of G is
in fact also the principal holomorph of its conjoint.

If K is the principal holomorph of G, then (K_0)_{11}, the subgroup of the
associated group K_0 of K consisting of all the substitutions of K_0 omitting
a_{11}, may be identified as the group of isomorphisms of G. That is,
(K_0)_{11} transforms G into all of its possible automorphisms, each
automorphism being given by but one substitution of (K_0)_{11}. In fact, the
argument used in extending Jordan’s theorem shows that the substitutions of one
of two simply isomorphic regular m-adic substitution groups on the same
sequence of Γ's can be transformed into the corresponding substitutions of the
other by an (m-1)-ad which omits, say, a_{11}. Hence, every automorphism of G
can be obtained by transforming G by the substitutions in (K_0)_{11}.
Furthermore, if two distinct substitutions of (K_0)_{11} yielded the same
automorphism of G, a substitution of (K_0)_{11} other than the identity would
transform each member of G into itself. But this substitution would have to be
in the associated group of the conjoint of G, and, as this conjoint is regular,
the substitution in question, which omits a_{11}, could only be the identity.

We can now prove, as in the ordinary case, that the order of the principal
holomorph of a regular m-adic substitution group is equal to the product of the
order of the group and the order of its group of isomorphisms. In fact, if G'
is the conjoint of the regular group G, K its holomorph, by expanding K in
cosets as regards its invariant subgroup G', we see that the substitutions of K
transform G in k/n different ways, k being the order of K. But K as well as
(K_0)_{11} must transform G into all of its possible automorphisms. For if s is
in K, t in (K_0)_{11}, as t runs through (K_0)_{11} giving all the
automorphisms of G, ts in K yields an equal number of automorphisms of G. Hence
the order of (K_0)_{11} is k/n, whence the above result.

To illustrate this result, consider the cyclic triadic group of degree and
order two generated by the triadic substitution s_1 = (a_{11} a_{21} a_{12} a_{22})
given in cycle form. The letters of Γ_1 are a_{11}, a_{12}, of Γ_2, a_{21},
a_{22}. The sole other triadic substitution on Γ_1, Γ_2 generated by s_1 is
s_2 = (a_{11} a_{22} a_{12} a_{21}), so that the group is seen to be regular.
s_2 also is a generator of the group, whence it follows that the group admits
exactly two automorphisms. Hence the order of its principal holomorph is four.
We find directly that s_3 = (a_{11} a_{21})(a_{12} a_{22}) and
s_4 = (a_{11} a_{22})(a_{12} a_{21}) interchange s_1 and s_2, so that the
principal holomorph consists of s_1, s_2, s_3, s_4. It is actually the entire
triadic symmetric group of degree two. This example serves to answer the
question whether some subgroup of the principal holomorph itself, instead of
its associated ordinary group, can be identified with the group of isomorphisms
of the given group. The answer in the present instance is no. For such a
subgroup would have to possess as element s_1 or s_2 to yield the identical
automorphism, but would then have for elements both s_1 and s_2, each yielding
that one automorphism.
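
The statements of this example are easy to confirm by direct computation. The
sketch below is an illustration; the encoding of letters as strings and of
substitutions as dicts, the left-to-right product, and the helper names are
assumptions made here. It checks that s_1 and s_2 generate each other and
records how each of s_1, s_2, s_3, s_4 transforms the pair (s_1, s_2).

```python
def cycles_to_map(*cycles):
    """Build a substitution (as a dict) from its cycles."""
    mapping = {}
    for cycle in cycles:
        for i, letter in enumerate(cycle):
            mapping[letter] = cycle[(i + 1) % len(cycle)]
    return mapping

def compose(p, q):
    """Product pq read left to right: apply p first, then q."""
    return {x: q[p[x]] for x in p}

def inverse(p):
    return {y: x for x, y in p.items()}

def transform(s, t):
    """Transform of s by t, namely t^{-1} s t."""
    return compose(compose(inverse(t), s), t)

s1 = cycles_to_map(("a11", "a21", "a12", "a22"))
s2 = cycles_to_map(("a11", "a22", "a12", "a21"))
s3 = cycles_to_map(("a11", "a21"), ("a12", "a22"))
s4 = cycles_to_map(("a11", "a22"), ("a12", "a21"))

# The triadic product of three substitutions is their ordinary product.
assert compose(compose(s1, s1), s1) == s2      # s1 generates s2
assert compose(compose(s2, s2), s2) == s1      # s2 is also a generator

group = [s1, s2]
for t in (s1, s2, s3, s4):
    images = [transform(s, t) for s in group]
    assert all(image in group for image in images)  # t transforms the group into itself
    print("identity" if images == group else "interchange")
```

The elements s_1 and s_2 both yield the identical automorphism, while s_3 and
s_4 both interchange s_1 and s_2, which is the obstruction described above.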

The immediate extension of the concept of complete group to m-adic
groups turns out to be rather trivial. Defining an m-adic group G to be
complete in the narrow sense if its own elements transform it in 1-1 fashion
into all of its possible automorphisms, we obtain the following result. An
m-group is complete in the narrow sense when and only when it is reducible to a
complete ordinary group. In fact, its sole element yielding the identical
automorphism must be of first order, and invariant under the group; hence the
reducibility. The rest of the theorem follows from the easily demonstrated
facts that if G is reducible to G', every automorphism of one group is also an
automorphism of the other, while the automorphism induced by any element of
either is the same for both. Since G can have but one invariant element, it
also follows that the net of derived groups of an m-adic group complete in the
narrow sense consists of a single complete 2-group, and its extensions, which
are then also complete in the narrow sense. If G is regular, and complete in
the narrow sense, we may use its elements as multipliers in the expansion of K
in cosets as regards G. We shall express this fact by saying that the principal
holomorph of an m-group complete in the narrow sense is the direct product of
the group and its conjoint. A precise abstract definition of this rather narrow
concept of direct product will be given in §25.

We do not obtain a less restrictive concept of completeness by asking
that the elements of G_0 transform G into all of its possible automorphisms
in 1-1 fashion; for the coset theorem shows that G and G_0 transform G
according to the same number of distinct automorphisms. We therefore define G
to be complete in the wide sense if the elements of its abstract containing
group G* transform it in 1-1 fashion according to all of its possible
automorphisms. Since only the identity of G* is now invariant under G, it
follows that an m-group complete in the wide sense is irreducible. If this
m-group G is of order g, and is expressed as a regular m-adic substitution
group, the order of its principal holomorph K will be (m-1)g^2. We now turn to
the containing groups for a direct product theorem, and easily find that the
containing group of the principal holomorph of an m-group complete in the wide
sense is the direct product of the containing groups of the group and its
conjoint.
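
The order (m-1)g^2 follows from the order theorem for principal holomorphs
proved above, since completeness in the wide sense puts the automorphisms of G
in one-to-one correspondence with the (m-1)g elements of G*. In the display
below the symbol Aut G for the group of isomorphisms of G is a notation
adopted here for brevity, not the paper's own.

```latex
|K| \;=\; |G| \cdot |\mathrm{Aut}\,G| \;=\; g \cdot (m-1)\,g \;=\; (m-1)\,g^{2}.
```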

Actually, a type of completeness can be defined for each divisor k of m-1,
an m-group G being said to be complete in the k-sense if it admits some
containing group of index k whose elements yield in 1-1 fashion all the
automorphisms of G. We then have that an m-group is complete in the k-sense
when and only when it is reducible to a (k+1)-group complete in the wide
sense. Furthermore, the net of derived groups of an m-group complete in the
k-sense consists of a single (k+1)-group complete in the wide sense, and its
extensions. With G written as a regular m-adic substitution group, its
principal holomorph will of course be of order kg^2. But there does not then
seem to be a direct product theorem in terms of groups.

We turn now to the purely m-adic generalization of holomorph. In ordinary
group theory, due to the presence of the identity, if all of the elements
of a group H transform a group G into one and the same group, that group
must be G itself. Hence if G is a regular substitution group, H a substitution
group on the letters of G, H will be the holomorph of G, or a subgroup thereof.
This need not be so for m-adic groups with m>2. Let then G be a regular
m-adic substitution group with m>2, H an m-adic substitution group on the
sequence of Γ's of G such that all of the substitutions of H transform G into
one and the same group G''(®). It follows that all of the substitutions of H
transform G'' into one and the same group G''', and so on. Since the m-ads
of H must transform G as do its elements, it follows that there will be a
cycle of μ-1 distinct, though not necessarily mutually exclusive, m-adic
groups (G', G'', ..., G^{(μ-1)}), such that G' = G, μ-1 is a divisor of m-1,
and all the elements of H transform each G^{(i)} into the cyclically following
G^{(i+1)}. Now all the m-adic substitutions on the sequence of Γ's of G which
transform G', G'', ..., G^{(μ-1)} into G'', G''', ..., G' respectively
constitute an m-adic substitution group K containing H. We shall then call K a
holomorph of G, and the holomorph of the cycle (G', G'', ..., G^{(μ-1)}). When
μ-1 = 1, K becomes the principal holomorph of G.

Given the regular G, an m-adic substitution s on the sequence of $\Gamma$'s of G
will be said to be holomorphic if it belongs to some holomorph of G. We then
readily see that the necessary and sufficient condition that s be holomorphic is
that $s^{m-1}$ is in the associated group of the principal holomorph of G. For that
associated group consists of all the $(m-1)$-ads on the sequence of $\Gamma$'s of G
which transform G into itself. The necessity of the condition then follows from
the fact that $s^m$ must transform G into the same group that s does, the
sufficiency from the fact that all the elements of the cyclic m-group generated by s
will then transform G into one and the same group. In particular, the $(n!)^{m-2}$
first order substitutions of degree n are all holomorphic for the regular G of
degree n. Hence, when the order of the principal holomorph of G is less than
$(n!)^{m-2}$, as must be so, for example, in the case of cyclic m-groups of order
greater than three, we are assured of the existence of a holomorph other than
the principal holomorph. Clearly, any element s of a holomorph of G
determines the corresponding cycle $(G', G'', \ldots, G^{(\mu-1)})$, and hence the
holomorph. It follows that all the holomorphs of a given G are mutually exclusive.
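
The count $(n!)^{m-2}$ of first order substitutions can be checked by brute force for small m and n. The following Python sketch is an illustration only and is not part of the original development. It assumes that an m-adic substitution of degree n is stored as a tuple of $m-1$ bijections, the i-th carrying the letters of one class into those of the cyclically following class, and it takes "first order" to mean that $s^{m-1}$ is the identity. The names `power_m_minus_1`, `count_first_order` and `predicted_count` are hypothetical.

```python
from itertools import permutations, product
from math import factorial


def power_m_minus_1(s):
    """Return s^(m-1) restricted to the first class, as a permutation of range(n).

    s is a tuple of m-1 bijections; s[i] sends letter j of class i to
    letter s[i][j] of class i+1 (cyclically).
    """
    n = len(s[0])
    image = list(range(n))
    for sigma in s:  # walk once around the cycle of classes
        image = [sigma[j] for j in image]
    return tuple(image)


def count_first_order(m, n):
    """Count m-adic substitutions s of degree n with s^(m-1) = identity.

    The restrictions of s^(m-1) to the other classes are conjugates of the
    restriction to the first class, so testing the first class suffices.
    """
    identity = tuple(range(n))
    perms = list(permutations(range(n)))
    return sum(
        1
        for s in product(perms, repeat=m - 1)
        if power_m_minus_1(s) == identity
    )


def predicted_count(m, n):
    """The value (n!)^(m-2) stated in the text."""
    return factorial(n) ** (m - 2)
```

The first $m-2$ components may be chosen freely and the last is then forced, which is the reason for the exponent $m-2$.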

Our next result shows that the order of any holomorph of G is no greater
than that of the principal holomorph of G. In fact, let $K', K'', \ldots, K^{(\mu-1)}$
be the principal holomorphs of $G', G'', \ldots, G^{(\mu-1)}$, K the holomorph of the
cycle $(G', G'', \ldots, G^{(\mu-1)})$. By writing an element t of $K_0$ as the product
of $m-1$ elements of K, we see that t must leave each $G^{(i)}$ invariant, and hence
be in each $K_0^{(i)}$. Conversely, if t is in each $K_0^{(i)}$, it will transform each $G^{(i)}$
into itself. If then s is in K, ts will also be in K, so that t must be in $K_0$.
That is, the associated group of the holomorph of $(G', G'', \ldots, G^{(\mu-1)})$ is
the logical product of the associated groups of the principal holomorphs of
$G', G'', \ldots, G^{(\mu-1)}$. It is readily verified that an s in K actually transforms
$K' \to K'' \to \cdots \to K^{(\mu-1)} \to K'$, and hence also
$K_0' \to K_0'' \to \cdots \to K_0^{(\mu-1)} \to K_0'$.
Hence, a subgroup of $K_0'$ invariant under s must be contained in $K_0$. We thus
have, in terms of G alone, the associated group of the holomorph of G corresponding
to a holomorphic s is the largest group or subgroup of the associated group of
the principal holomorph of G invariant under s.

(®) The reader will note the marked analogy with Corral’s concept of a function pertaining
to a brigade, that is, one carried into the same function by all the substitutions of the brigade.

On turning to an order theorem for these m-adic holomorphs, we observe
first that the holomorph of a cycle $(G', G'', \ldots, G^{(\mu-1)})$ is also the holomorph of
the “conjoint cycle” $(\bar G', \bar G'', \ldots, \bar G^{(\mu-1)})$. For, inasmuch as transforms of
commutative substitutions are commutative, transforms of conjoint regular
groups are conjoint. Hence, if s transforms $G' \to G'' \to \cdots \to G^{(\mu-1)} \to G'$, it
must transform $\bar G' \to \bar G'' \to \cdots \to \bar G^{(\mu-1)} \to \bar G'$, and conversely. Now let K be
the holomorph of the cycle $(G', G'', \ldots, G^{(\mu-1)})$. Each s in K transforms in
1-1 fashion the elements of $G' \to G'' \to \cdots \to G^{(\mu-1)} \to G'$, and hence determines
a $\mu$-adic substitution on the $\mu-1$ classes of elements $G', G'', \ldots, G^{(\mu-1)}$ which
may be termed a $\mu$-adic automorphism of the cycle $(G', G'', \ldots, G^{(\mu-1)})$.
The class of all such $\mu$-adic substitutions on $G', G'', \ldots, G^{(\mu-1)}$ obtained
through substitutions in K clearly constitutes an m-adic group which
we shall term the restricted m-adic group of isomorphisms of the cycle
$(G', G'', \ldots, G^{(\mu-1)})$, restricted, both by the possible narrowness of K, and
by the fact that while an m-adic substitution will transform any one $G^{(i)}$ into
$G^{(i+1)}$ according to any simple isomorphism, it need not be able to do this
arbitrarily and simultaneously for each i. Now $s_1$ and $s_2$ of K will yield the same
$\mu$-adic automorphism of the cycle $(G', G'', \ldots, G^{(\mu-1)})$ when and only when
the $(m-1)$-ad $s_2 s_1^{-1}$ transforms each element of each $G^{(i)}$ into itself, and hence,
when and only when $s_2 s_1^{-1}$ is in $\bar G_0 = \bar G_0' \bar G_0'' \cdots \bar G_0^{(\mu-1)}$(*). Note that $K_0$
consists of all $(m-1)$-ads which transform each $G^{(i)}$ into itself, and hence has $\bar G_0$
for subgroup, one, indeed, invariant under K. By expanding K in cosets as
regards $\bar G_0$, we then obtain the following result. The order of the holomorph of
$(G', G'', \ldots, G^{(\mu-1)})$ is the product of the order of the crosscut of the associated
groups of the conjoints of $G', G'', \ldots, G^{(\mu-1)}$ and the order of the restricted
m-adic group of isomorphisms of $(G', G'', \ldots, G^{(\mu-1)})$.

This result is weaker than the result for the principal holomorph of G in
two ways. On the one hand, the order of $\bar G_0$ replaces the order of G itself.
More significantly, in the case of the principal holomorph, we identified
$(K_0)_1$ with the group of isomorphisms of G. Note that there both K and $K_0$
yielded every possible automorphism of G. In the present case $K_0$ transforms
each $G^{(i)}$ into itself, the distinct transformations being $(\mu-1)$-ads of $\mu$-adic
substitutions on $G', G'', \ldots, G^{(\mu-1)}$, and constituting the associated group of
the restricted m-adic group of isomorphisms of the cycle $(G', G'', \ldots, G^{(\mu-1)})$.
If then we ask whether $(K_0)_1$ can be identified with this associated restricted
group of isomorphisms, we find that while no two members of $(K_0)_1$ can
transform the $G^{(i)}$'s in the same way, for $(K_0)_1$ to transform the $G^{(i)}$'s in every
way that $K_0$ does, it is necessary and sufficient that $K_0$ carry $a_{11}$ into no other
letters than does its subgroup $\bar G_0$. We have not succeeded in answering the
question thus posed; and hence, whether $(K_0)_1$, or any other subgroup of $K_0$,
can be identified as the associated restricted group of isomorphisms of the
cycle $(G', G'', \ldots, G^{(\mu-1)})$ remains one of our unsolved problems(*).

(*) Product here is logical product.

17. m-adic groups of $\mu$-adic substitutions. The present extension of the
concept of m-adic substitution group is indispensable for a self-contained the- 
ory of primitivity, our next topic. This extension has the advantage of includ- 
ing m-adic groups of ordinary substitutions in its scope. However, the fact 
that any abstract m-adic group can be represented as a regular m-adic sub- 
stitution group is perhaps sufficient reason for our restricting the explicit 
study of this wider class of substitution groups to the next section. 

Given a cycle of equivalent classes $\Gamma_1, \Gamma_2, \ldots, \Gamma_{\mu-1}$, not only will
the product of $\mu$ $\mu$-adic substitutions on these $\Gamma$'s be a substitution of the
same kind, but also the product of any m such substitutions, provided m is
in the form $k(\mu-1)+1$. We are thus led to the concept of an m-adic group of
$\mu$-adic substitutions, or $(m, \mu)$ substitution group, with m and $\mu$ subject to the
sole condition that $\mu-1$ be a divisor of $m-1$. We have already met this
concept in the last section where the corresponding $\Gamma$'s, $G', G'', \ldots, G^{(\mu-1)}$, while
distinct, were probably not necessarily mutually exclusive(®*). In what
follows, for simplicity, as in our previous development, we shall assume the $\Gamma$'s
to be mutually exclusive.
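
The divisibility condition linking m and $\mu$ can be made concrete with a short Python sketch. It is offered only as an illustration under the reading that a $\mu$-adic substitution advances the class index by one modulo $\mu-1$, so that a product of m of them advances it by m; the function names `admissible_mu` and `product_is_mu_adic` are hypothetical.

```python
def admissible_mu(m):
    """All mu >= 2 for which mu - 1 divides m - 1."""
    return [mu for mu in range(2, m + 1) if (m - 1) % (mu - 1) == 0]


def product_is_mu_adic(m, mu):
    """Check whether a product of m mu-adic substitutions is again mu-adic.

    Each factor shifts the class index by 1 modulo mu - 1, so the product
    shifts it by m; the product is mu-adic when that shift is congruent to 1.
    """
    modulus = mu - 1
    return m % modulus == 1 % modulus
```

For m = 7, for instance, `admissible_mu(7)` lists the values of $\mu$ for which a heptadic group of $\mu$-adic substitutions can exist, and `product_is_mu_adic` agrees with that list for every $\mu$ from 2 to 7.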

It is not difficult to review our previous work to see how much goes over
to $(m, \mu)$ substitution groups. The chief failure turns out to be the extension
of Jordan’s theorem on regular groups. Particular mention must be made of
the structure of the containing group of an $(m, \mu)$ group G. Letting, for
simplicity, G* symbolize the containing ordinary group of G generated by the
elements of G, G* will now be of some index k which is a divisor of $m-1$ and a
multiple of $\mu-1$. We must now distinguish between i-ads of G and i-ads in G*,
the former being the products of any i substitutions in G, the latter all
products in G* of i $\mu$-adic substitutions. In particular, there will be $k/(\mu-1)$ cosets
in G* consisting of $(\mu-1)$-ads, one and only one of these cosets being $G_0$.

In connection with the next section, the extension of the concept of
transitivity to $(m, \mu)$ substitution groups is of most importance. Actually, our
definition of transitivity as applied to m-adic substitution groups can be restated
verbatim for $(m, \mu)$ substitution groups. It is then readily verified that all
of the results of §13 go over, with the possible replacement of m by $\mu$, with one
exception. And that is that the transitivity of G* no longer assures the
transitivity of G(*").

(5) The above theory of m-adic holomorphs can be paraphrased for ordinary groups. Thus,
if s is a substitution on the letters of an ordinary regular group $G'$, but not in the holomorph
of $G'$, and if $s^{m-1}$ is the first positive power of s in the holomorph of $G'$, then a cycle of regular
groups $G', G'', \ldots, G^{(m-1)}$ is determined such that, under s,
$G' \to G'' \to \cdots \to G^{(m-1)} \to G'$. The
set of all substitutions on the letters of $G'$ thus transforming this now given cycle of G's will
then constitute an m-adic group of ordinary substitutions, which may then be called an m-adic
holomorph of G. The above theory, in somewhat simpler form, will then go over.

(5) On the other hand, the most general possibility is still not there illustrated; for a given
element is carried into a single element independently of the $G^{(i)}$ of which it is an element.
Note that our last footnote further introduced an $(m, 2)$ substitution group as m-adic
holomorph of an ordinary regular group.

18. Primitive and imprimitive $(m, \mu)$ substitution groups. The distinct
sets $\{a_{ij}\}$ of §15 are transformed as units under all the substitutions of the
containing group G* of the transitive m-adic substitution group G, and hence
under the substitutions of G. We recall that each set $\{a_{ij}\}$ had $a$ letters in
each of $x$ $\Gamma$'s whose subscripts formed an arithmetic progression, $a$ being a
divisor of n, the degree of G, $x$ of $m-1$. Let $v=n/a$, $m'-1=(m-1)/x$. Each
set $\{a_{ij}\}$ then has letters in one and only one of the first $(m'-1)$ $\Gamma$'s, there
being $v$ sets for each such $\Gamma$. The $(m'-1)v$ distinct sets $\{a_{ij}\}$ thus fall into
$m'-1$ mutually exclusive classes $\Gamma_1', \Gamma_2', \ldots, \Gamma_{m'-1}'$, of $v$ members each. As
any m-adic substitution s in G transforms each $\Gamma_i$ into $\Gamma_{i+1}$, it will in 1-1
fashion transform the members of $\Gamma_1' \to \Gamma_2' \to \cdots \to \Gamma_{m'-1}' \to \Gamma_1'$, and so
define an m'-adic substitution on $\Gamma_1', \Gamma_2', \ldots, \Gamma_{m'-1}'$. The totality of these
m'-adic substitutions will then constitute an $(m, m')$ substitution group $G'$
of degree $v$. As G is transitive, it follows that $G'$ is transitive, there being, as
in the ordinary case, a (1, V) isomorphism between $G'$ and G. With a
restriction to be noted later, G will be said to be imprimitive with systems of
imprimitivity $\{a_{ij}\}$ whenever $1<(m'-1)v<(m-1)n$, this however, as in the
ordinary case, being but an example of the general concept of imprimitivity.

We thus see that even if we start with transitive m-adic substitution
groups, i.e., $(m, m)$ groups, we are led to $(m, \mu)$ groups. This extension is
however sufficient for our purpose. For if we start with a transitive $(m, \mu)$
substitution group G, and define the sets $\{a_{ij}\}$ as before, we obtain, by the same
argument, an analogous distribution theorem, and then, as above, a
transitive $(m, \mu')$ substitution group $G'$ with $\mu'-1$ a divisor of $\mu-1$.

In general, then, let G be a transitive $(m, \mu)$ substitution group of degree n
on $\Gamma_1, \Gamma_2, \ldots, \Gamma_{\mu-1}$, with, of course, $\mu-1$ a divisor of $m-1$, and let there be
some separation of the $(\mu-1)n$ letters of the $\Gamma$'s into mutually exclusive
classes such that these classes are transformed as units under all the
substitutions of G, and hence of G*. An entirely analogous argument to the one used in
determining the distribution of the letters in the sets $\{a_{ij}\}$ leads to a
corresponding conclusion here. That is, each class consists of the same number $xa$
of letters, with $a$ a divisor of n, $x$ of $\mu-1$, there being $a$ letters in each of $x$ $\Gamma$'s
whose subscripts are in arithmetic progression. As with the $\{a_{ij}\}$'s, each such
separation of the letters of the $\Gamma$'s leads to a transitive $(m, \mu')$ substitution
group $G'$ of degree $v$, where $v=n/a$, $\mu'-1=(\mu-1)/x$. The numbers $a$ and $x$,
of course, depend on the separation in question.

(57) On the other hand, if G* is transitive, we may form a sequence of mutually exclusive
classes $\Gamma_1', \Gamma_2', \ldots, \Gamma_{\mu'-1}'$ such that G is a transitive $(m, \mu')$ group on the $\Gamma'$'s. Here $\mu'-1$ is a
multiple of $\mu-1$, and a divisor of $m-1$; while the $\Gamma'$'s are successively subclasses of the $\Gamma$'s
run through cyclically $(\mu'-1)/(\mu-1)$ times, and together exhausting the $\Gamma$'s. If we call such
an $(m, \mu)$-group G semi-transitive, then the main result of §14 goes over for an arbitrary $(m, \mu)$-
group if we replace the transitive constituent groups by semi-transitive constituent groups.

In accordance with the standard definition we would then define G to be
imprimitive if some such separation into mutually exclusive classes is possible
with $1<(\mu'-1)v<(\mu-1)n$, the classes then being the corresponding systems
of imprimitivity of G. This restriction is equivalent to $x$ and $a$ not being both
one, or simultaneously equal to $\mu-1$ and n respectively. But then G would
always be imprimitive for $\mu>2$, since its substitutions transform the $\Gamma$'s
themselves as units. We therefore exclude the case $a=n$, and thus have the
following definition. G will be said to be imprimitive if it admits systems of
imprimitivity for which $a<n$, $x$ and $a$ not both unity; otherwise G will be said to
be primitive.

The rather artificial restriction $a<n$ is entirely natural in the case $\mu=2$,
and, indeed, we have here the only complete generalizations of the primitivity
theorems of ordinary groups. G is now an m-adic group of ordinary
substitutions on letters which may then be written $a_1, a_2, \ldots, a_n$. It is easily seen
that if G is transitive, so are G* and $G_0$, the converse however holding only
for $G_0$. On the other hand, with G transitive, if G is imprimitive, so are G*
and $G_0$, the converse holding only for G*. Thus, with $m=n+1$, G can be the
intransitive group consisting of the single substitution $(a_1a_2 \cdots a_n)$, while G*
is transitive. And the following is an example of a transitive primitive G for
which $G_0$ is imprimitive. Let $G_0$ be the transitive imprimitive group of order
four: 1, $(a_1a_2)(a_3a_4)$, $(a_1a_3)(a_2a_4)$, $(a_1a_4)(a_2a_3)$. Then $s=(a_1a_2a_3)$ transforms $G_0$
into itself, while $s^3=1$ is in $G_0$. Hence $G=G_0s$ is a transitive tetradic group of
ordinary substitutions, and is easily seen to be primitive.
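
The tetradic example can be verified mechanically. The Python sketch below is an illustration only; it assumes that the group of order four meant in the text is the four-group 1, $(a_1a_2)(a_3a_4)$, $(a_1a_3)(a_2a_4)$, $(a_1a_4)(a_2a_3)$ and that $s=(a_1a_2a_3)$, with the letters $a_1, \ldots, a_4$ written as the indices 0 to 3. All function and constant names are hypothetical.

```python
from itertools import product


def compose(p, q):
    """Apply p first, then q (permutations as tuples on range(len(p)))."""
    return tuple(q[p[i]] for i in range(len(p)))


IDENTITY = (0, 1, 2, 3)
# Assumed four-group on a1..a4, written on indices 0..3.
G0 = {IDENTITY, (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)}
S = (1, 2, 0, 3)  # the cycle (a1 a2 a3)
G = {compose(v, S) for v in G0}  # the coset G0 s

# The three ways of splitting four letters into two pairs.
BLOCK_SYSTEMS = [
    ({0, 1}, {2, 3}),
    ({0, 2}, {1, 3}),
    ({0, 3}, {1, 2}),
]


def k_fold_products(elements, k):
    """Set of all products of k elements taken from the given set."""
    results = set()
    for factors in product(elements, repeat=k):
        result = factors[0]
        for f in factors[1:]:
            result = compose(result, f)
        results.add(result)
    return results


def is_transitive(elements, n=4):
    """True if each letter is carried into every letter by some element."""
    return all({p[a] for p in elements} == set(range(n)) for a in range(n))


def preserved_block_systems(elements):
    """Block systems whose blocks are permuted as units by every element."""
    kept = []
    for system in BLOCK_SYSTEMS:
        blocks = [frozenset(b) for b in system]
        if all(
            frozenset(p[x] for x in b) in blocks
            for p in elements
            for b in blocks
        ):
            kept.append(system)
    return kept
```

Under these assumptions the products of four elements of G fall back into G, the products of three give $G_0$, no pairing of the letters is preserved by G, and every pairing is preserved by $G_0$.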

Turning to the ordinary theorems on primitivity, with G thus a transitive
$(m, 2)$ group, there will be at least one substitution in G carrying $a_1$ into itself,
and the totality of these substitutions will constitute a subgroup $G_1$ of G. We
then have the complete analogue of the corresponding theorem for ordinary
substitution groups, i.e., a necessary and sufficient condition that a transitive
m-adic group G of ordinary substitutions is imprimitive is that $G_1$ is contained
in a larger subgroup of G. While this can be proved by applying the ordinary
theorem to G*, the ordinary proof(5*) can here be directly carried over. Thus,
if G is imprimitive, the substitutions of G transforming the system of
imprimitivity of which $a_1$ is a member into itself constitute a subgroup K of G
containing $G_1$, and larger than $G_1$. Conversely, if K is a subgroup of the transitive
G containing $G_1$, and larger than $G_1$, expand G in right cosets as regards K.
Each coset will consist of the same number $a>1$ of right cosets of G as
regards $G_1$, and hence will carry $a_1$ into $a$ letters, distinct for each coset, and will
consist of all the substitutions of G carrying $a_1$ into one of those letters.
Finally, any substitution s of G will transform these mutually exclusive sets
of letters as units. For if $K_0s_1$ is the coset carrying $a_1$ into one of these sets,
that set will be transformed by s as $a_1$ is by $K_0s_1s$. But $K_0s_1s$ is the same as
$s_0K_0s_2$, with $s_0$ in $G_1$, $s_2$ some element in G, and hence transforms $a_1$ into that
one of the above sets into which the coset $K_0s_2$ carries $a_1$.

(®) Rather what the proof of the more general theorem of Finite Groups, page 39, would
become if given directly for the more special result. We have interchanged the order of the two
results.

We shall prove in §24 that if an element or subgroup of an m-group G is 
transformed by the elements of G, the resulting set of distinct transforms con- 
stitutes a “complete set of conjugates” under G, and is transformed by the 
elements of G according to a transitive m-adic group of ordinary substitutions 
having a (1, V) isomorphism with G. We again then are concerned with the 
case $\mu=2$; and either by applying the preceding result in conjunction with
that isomorphism, or by directly extending the ordinary proof as was done 
above, we again obtain the complete analogue of the corresponding ordinary 
group theorem(®). A necessary and sufficient condition that a complete set of 
conjugate elements or subgroups under an m-group G of an element or subgroup 
of G is transformed under G according to an imprimitive m-adic group of ordi- 
nary substitutions is that the largest subgroup of G which transforms into itself 
one of these elements or subgroups is contained in a larger subgroup of G. 

When G is a transitive $(m, \mu)$ group with $\mu>2$ we no longer have an
analogue of $G_1$ for G itself. We must therefore go outside of G for theorems on
imprimitivity. G* will still be transitive; and apart from the restriction $a<n$,
a set of systems of imprimitivity of either G or G* will also be one of the other.
Our description of the possible systems of imprimitivity of G therefore applies
equally well to G*, and we conclude that G will be imprimitive when and only
when G* admits a set of systems of imprimitivity for which $a<n$. As G* is an
ordinary transitive substitution group, we easily supplement the standard
result concerning its imprimitivity to obtain the following. A transitive $(m, \mu)$
group G is imprimitive when and only when G* has a subgroup containing $G^*_{11}$,
larger than $G^*_{11}$, but not containing $G_0$. $G^*_{11}$ is of course the subgroup of G*
consisting of all of its substitutions omitting $a_{11}$. In proving this result we observe
that as a consequence of the transitivity of G the substitutions of $G_0$ will carry
any letter into every letter in its $\Gamma$. If then G, and hence G*, is imprimitive,
the subgroup K of G*, composed of all the substitutions of G* which
transform the system of imprimitivity of which $a_{11}$ is a member into itself, satisfies
the conditions of the theorem. For K is known to be a subgroup of G*
containing $G^*_{11}$, and larger than $G^*_{11}$. And as it can carry $a_{11}$ into only $a<n$ letters
of $\Gamma_1$, it cannot contain $G_0$. Conversely, if K is a subgroup of G* satisfying the
conditions of the theorem, the letters into which the substitutions of K carry
$a_{11}$ are known to form one of a set of systems of imprimitivity of G*. As K will
then contain all the substitutions of G* which carry $a_{11}$ into any letter that
one substitution of K carries $a_{11}$ into, could it carry $a_{11}$ into all the letters of $\Gamma_1$,
it would contain $G_0$, contrary to hypothesis. Hence, the $a$ of the resulting
systems of imprimitivity of G is less than n, whence G too is imprimitive.

(5) At least as stated on page 39, Finite Groups.

A criterion for the imprimitivity of G in terms of $G_0$ would be preferable
to one in terms of G*. Our example of a primitive $(m, 2)$ group whose
associated group was imprimitive precludes such a criterion for an arbitrary $(m, \mu)$
group. However when $\mu=m$ we do have the following partial criterion in
terms of, better than $G_0$, the associated constituent groups of G. A transitive
m-adic substitution group G admits systems of imprimitivity with $a>1$ when and
only when the associated constituent group $G_0'$ is imprimitive. In fact, systems
of imprimitivity of G must be permuted as units under $G_0$. Hence the
portions of these systems in $\Gamma_1$ are permuted as units under $G_0'$. As $a>1$ by
hypothesis, and $a<n$ by definition, we thus have a set of systems of
imprimitivity of $G_0'$. Conversely, given a set of systems of imprimitivity of $G_0'$, any s of G
will transform $G_0' \to G_0'' \to \cdots \to G_0^{(m-1)} \to G_0'$, and hence will successively
transform the systems of imprimitivity of $G_0'$ into systems of imprimitivity of
$G_0'', \ldots, G_0^{(m-1)}$. The result of transforming these systems of imprimitivity
of $G_0^{(m-1)}$ by s is the same as that of transforming the given systems of
imprimitivity of $G_0'$ by $s^{m-1}$. As $s^{m-1}$ is in $G_0$, it transforms the systems of
imprimitivity of $G_0'$ as units. Hence s transforms the totality of systems of
imprimitivity of $G_0', G_0'', \ldots, G_0^{(m-1)}$ as units. As $G_0$ does the same, so will
$G=G_0s$ which is therefore imprimitive with $1<a$ $(<n)$.

For the exceptional case $a=1$ we have to return to G*. The same
considerations that gave us our general criterion for the imprimitivity of a transitive
$(m, \mu)$ group yield the following result. A transitive $(m, \mu)$ group G admits
systems of imprimitivity with $a=1$ when and only when G* has a subgroup
containing $G^*_{11}$, larger than $G^*_{11}$, but having no other substitutions than those of $G^*_{11}$
that carry each $\Gamma$ into itself. The last condition is equivalent to the crosscut
of the subgroup in question and Gp being identical with (Go), the crosscut 
of Gi, and Gz, and hence, for an m-adic substitution group G, with Gj. 

Though all of the development of the next section, and, with certain re- 
strictions, of the one following, can be given for (m, uw) substitution groups, 
we restrict ourselves, for the sake of simplicity, to m-adic substitution groups, 
i.e., (m, m) groups. 

19. Multiple transitivity; cyclically transitive m-adic substitution groups.
Various extensions of the concept of multiple transitivity suggest themselves.
According to the simplest, an m-adic substitution group G would be said to
be r-fold transitive if any r letters belonging to any one $\Gamma$ can be transformed
into any r letters of the succeeding $\Gamma$ by the substitutions of the group. It is
readily proved that a necessary and sufficient condition that G be thus r-fold
transitive is that the associated constituent groups $G_0', G_0'', \ldots, G_0^{(m-1)}$ (or
any one of them) be r-fold transitive in the ordinary sense. Since the order
of $G_0'$ is a divisor of the order of $G_0$, and hence of G, it follows from the
corresponding ordinary group result that the order of an r-fold transitive m-adic
substitution group of degree n is a multiple of $n(n-1) \cdots (n-r+1)$.

Of special interest in polyadic theory is the type of multiple transitivity
we term cyclic transitivity. An m-adic substitution group G will be said to
be cyclically transitive if, given any two selections from the classes of letters
$\Gamma_1, \Gamma_2, \ldots, \Gamma_{m-1}$, some substitution of G will carry the letters of one
selection into the letters of the other. Actually then, if one selection is
$a_{1j_1}, a_{2j_2}, \ldots, a_{(m-1)j_{m-1}}$, the other $a_{1k_1}, a_{2k_2}, \ldots, a_{(m-1)k_{m-1}}$, the substitution
will carry $a_{1j_1} \to a_{2k_2}$, $a_{2j_2} \to a_{3k_3}$, $\ldots$, $a_{(m-1)j_{m-1}} \to a_{1k_1}$. Every cyclically
transitive m-adic substitution group is then transitive, and, indeed, for $m=2$ cyclic
transitivity reduces to transitivity. The symmetric and alternating m-adic
groups of degree n, previously observed to be transitive, in the latter cases
at least for $n>2$, are now seen to be cyclically transitive.

The $m-1$ $\Gamma$'s, of n letters each, give rise to $n^{m-1}$ selections which we shall
call cycles. Any m-adic substitution s on the $\Gamma$'s will merely permute these
cycles, and hence will determine an ordinary substitution on these $n^{m-1}$ cycles
as new “permutants.” Since this relationship is preserved under
multiplication, the members of an m-adic substitution group G of degree n will thus
give rise to substitutions on the cycles forming an m-adic group $G'$ of ordinary
substitutions, of degree $n^{m-1}$, isomorphic with G. Clearly, different m-adic
substitutions yield different substitutions of the cycles. Hence $G'$ is indeed
simply isomorphic with G. In particular, then, $G'$ and G are of the same order.
Finally, if G is cyclically transitive, $G'$ will be transitive, and, indeed,
conversely. Since the order of an m-adic transitive group of ordinary
substitutions is a multiple of its degree, we have, as our first result, the order of a
cyclically transitive m-adic substitution group of degree n is a multiple of $n^{m-1}$.
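
The action on selections can be tried out on the symmetric m-adic group of small degree. The following Python sketch is illustrative only; it assumes an m-adic substitution is stored as a tuple of $m-1$ bijections, component i carrying letter j of class i into letter `s[i][j]` of the next class, and a selection as a tuple holding one letter index per class. The function names are hypothetical.

```python
from itertools import permutations, product


def symmetric_m_adic_group(m, n):
    """All m-adic substitutions of degree n, as tuples of m-1 bijections."""
    perms = list(permutations(range(n)))
    return list(product(perms, repeat=m - 1))


def image_of_selection(s, selection):
    """Carry a selection (one letter per class) through the substitution s."""
    k = len(s)  # number of classes, m - 1
    image = [None] * k
    for i, letter in enumerate(selection):
        # The letter chosen in class i lands in class i+1 (cyclically).
        image[(i + 1) % k] = s[i][letter]
    return tuple(image)


def is_cyclically_transitive(group, m, n):
    """True if every selection is carried into every selection by the group."""
    selections = set(product(range(n), repeat=m - 1))
    return all(
        {image_of_selection(s, c) for s in group} == selections
        for c in selections
    )
```

With m = 3 and n = 3 the group has 36 members and there are 9 selections, so the order is a multiple of $n^{m-1}$ as the result requires.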

We have seen that transitive m-adic groups of ordinary substitutions have
the complete analogue of the $G_1$ of ordinary transitive groups. Actually, all of
the corresponding theory goes over. In our simple isomorphism between G
and $G'$, the subgroup of $G'$ consisting of all of its substitutions “omitting”
a given cycle C will correspond to the subgroup $G_C$ of G consisting of all of its
substitutions which carry C into itself. We shall call $G_C$ the cycle subgroup of G,
corresponding to C. Actually, if C is the selection $a_{1j_1}, a_{2j_2}, \ldots, a_{(m-1)j_{m-1}}$,
it will be transformed into itself according to the cyclic substitution
$(a_{1j_1} a_{2j_2} \cdots a_{(m-1)j_{m-1}})$ by all the substitutions of $G_C$. Hence, if the m-adic
substitutions of G be written as ordinary substitutions on $(m-1)n$ letters
in cycle form, $G_C$ will consist of those substitutions of G which have this cyclic
substitution as component. It will be convenient to speak of these
substitutions as having the cycle C. Clearly, then, an m-adic substitution of degree n
cannot have more than n cycles.

Each of the $n^{m-1}$ cycles yields thus a corresponding cycle subgroup of the
cyclically transitive G. The simple isomorphism between G and $G'$ then
immediately transforms the corresponding properties of $G'$ to yield, among
others, the following results on G. The order of each cycle subgroup of G is
equal to the order of G divided by $n^{m-1}$. The cycle subgroups of G are
conjugate, forming a complete set of conjugates under the substitutions of G.
If all the substitutions of a given cycle subgroup have exactly $a$ cycles in
common, and they have one cycle in common by definition, then the $n^{m-1}$ cycles
can be separated into mutually exclusive sets of $a$ cycles each such that
different cycles yield the same cycle subgroup when and only when they belong to
the same set.

There are thus $n^{m-1}/a$ distinct cycle subgroups of G. The only information
that $G'$ yields concerning $a$ is that it is a divisor of $n^{m-1}$. However, our
observation that an m-adic substitution of degree n cannot have more than n
cycles shows that $a \le n$. Hence, a cyclically transitive m-adic substitution group
of degree n has a number N of cycle subgroups with $N \ge n^{m-2}$ and a divisor of
$n^{m-1}$. Whether N is actually a multiple of $n^{m-2}$, i.e., $a$ a divisor of n, is another
of our unsolved problems.

20. Class of an m-adic substitution group. The class of an ordinary sub- 
stitution group is the smallest number of letters appearing in any of its sub- 
stitutions, other than the identity, when those substitutions are written in 
cycle form. Since the substitutions of an m-adic substitution group G never 
carry a letter into itself when m>2, we are led to define the class of G as the 
class of its associated group Go. This also is the class of its containing group 
G* (5), 

With this definition most of the elementary theory of class goes over to 
m-adic substitution groups. We have almost immediately that the m-adic 
symmetric group of degree n, n>1, is a primitive group of class 2, while the 
m-adic alternating groups of degree n, n>2, are primitive groups of class 3. 
That these m-adic groups are primitive follows from the fact that their
constituent associated groups are either the symmetric, or alternating, ordinary
groups of degree n, and hence primitive, so that the m-adic groups do not
admit systems of imprimitivity with a>1; and, being cyclically transitive, 
they cannot admit systems of imprimitivity with a=1. As for their class, 
they are clearly at most of the class indicated. And could an alternating group 
actually be of class 2, the ε-subgroup of its corresponding δ-subgroup would
have an ε-sequence with one $-1$; but then the δ-subgroup would be the
complete δ-group, and hence the given group not an alternating group, but the
symmetric group.

(®) Not necessarily so, however, for $(m, \mu)$-groups with $\mu<m$.

We now prove that, as in the standard theory, the converses of these
results also hold. First then let G be a primitive m-adic substitution group of
degree n and of class 2. On the one hand, its associated constituent group $G_0'$
will be primitive; on the other hand, its associated group $G_0$ will have some
substitution whose component in each $G_0^{(i)}$ but one is the identity, and in
that one a transposition. By the invariance of $G_0$ under G, we see that $G_0$
has a substitution $t_0$ of the form $t_0' \cdot 1 \cdots 1$, with $t_0'$ in $G_0'$ and a transposition.
Now let Gj be that subgroup(*) of G/ composed of all the substitutions ¢’ of 
Gé for which t’-1---1 is in Go. The subgroup G/ is clearly an invariant 
subgroup of Go, and hence of G;, and it has the transposition ti. Now the 
standard proof of the fact that a primitive (ordinary) group of class 2 is the 
corresponding symmetric group also yields the following more general state- 
ment. An invariant subgroup(®) of class 2 of a primitive group is the cor- 
responding symmetric group. Hence Gj is the symmetric group of degree n. 
G, therefore has among its elements every substitution of the form #’-1 - - - 1. 
Since Go is invariant under G, it also has every substitution of the form 
1---#@..-.1, for each i, and hence every substitution of the form 
t't’’ - - - £™-), Go is therefore the associated group of the m-adic symmetric 
group of degree m, and hence G the symmetric group itself. Hence, every primi- 
tive m-adic substitution group of class 2 and degree n is the corresponding sym- 
metric group, and conversely for n>1. 

If G is a primitive m-adic substitution group of degree n and class 3, we
have as before that G)’ is primitive, while Go has a substitution of the form 
tj -1---+1 with ¢ of the form abc, the last since the substitution of class 3 
in Go must consist of a single cycle of three letters which, in turn, must then 
belong to a single I’. Defining Gj as before, we see that G/ is of class 3, and 
hence, by the corresponding extension of the standard result, is the alternat- 
ing group of degree nm. We therefore conclude that Gp has, perhaps among 
others, every substitution of the form ¢’t’’ - - - ¢™-» with the ¢“’s positive 
substitutions. Go therefore has every possible substitution corresponding to 
the e-sequence (+1, +1,---, +1), and hence every possible substitution 
for each of the e-sequences of its substitutions. It is therefore the associated 
group of an alternating group, i.e., G is an alternating group. Hence, every 
primitive m-adic substitution group of degree n and of class 3 is an alternating 
group of degree n, and conversely for n>2. 

Actually two cases arise as far as Gy is concerned. When the above found 
substitutions of Go are its only substitutions, (+1, +1,---, +1) is its only 
€-sequence, G is an alternating group whose 6-subgroup is of the first order, 
while G¢ is identical with Gj , and hence is itself the alternating group. Other- 
wise, Gj will be larger than Gj, while containing it, and hence will be the 
symmetric group, while G will be an alternating group whose 6-subgroup is 
of order greater than one. Note also that in both of the above results the hy- 
pothesis of the primitivity of G was used only in deducing the primitivity 
of Gg . We therefore conclude that there does not exist an m-adic substitution 
group G of class 2 or 3 for which G is imprimitive, Gi primitive. 


(*) If not Go’ itself. 
(®) Actually improper, therefore. 



Let now G be a primitive m-adic group of degree n and of class p, p a prime
greater than 3. As before, the substitution of class p in $G_0$ must consist of a
single cycle of letters which therefore belong to a single $\Gamma$. Hence $n \ge p$.
Furthermore, Gj will have a substitution of class p for element, and hence be of
class p. Finally G is primitive, with Gj as invariant subgroup. With the 
corresponding ordinary proof generalized as in the preceding cases, we then 
find that Gj is (n—p+1)-fold transitive. The remainder of the standard proof 
is then directly applicable to Gi and shows that n cannot be greater than
p+2. Hence, if a primitive m-adic substitution group is of class p, p being a 
prime number greater than 3, its degree can only be p, p+1 or p+2. Note that 
actually Gj is then itself primitive—immediately so for n=p, and as a con- 
sequence of its being more than simply transitive for n= p+1 or p+2. Hence, 
in each of these cases, Gj is the unique primitive ordinary group of class p 
and degree n. 

We consider in detail only the case n = p. G/ is then the group of order p,
as is also each G, defined in analogous fashion. Each G0 is therefore a cyclic 
group, and is, in fact, generated by a single cycle of the p letters of T,;. By 
relettering the members of the I'’s we may therefore assume that G0 is gen- 
erated by the substitution = - - - a;p). Now any substitution ¢ of Go 
will transform each G°® into itself, and hence will transform ¢, the generator 
of Go, into some power v‘® of itself, with v =1, 2,---, p—1. Hence, with 
each ¢t in Gp we can thus associate a v-sequence (v’, v’’, - - - , v™—-»). Likewise, 
if s is any substitution in G, s will transform each G® into G+”. It will there- 
fore transform each #{? into some power yu“ of +, w=1, 2,---, p—1. 
Hence, with each s in G we can thus associate a u-sequence [u’,u’’, - J. 
Since Go has the substitution 1 - - - © - - - 1 whenever G has the substitu- 


tion ¢‘, we see that Go has the invariant subgroup 
Go = --- 


the direct product of the G{®’s, when it is not Gp itself. Now G® consists of 
all the substitutions on the letters of I’; that transform 4 into itself. It fol- 
lows that Gp consists of all the (m—1)-ads, consequently p™—! in number, 
which transform each ¢ into itself. Go, therefore, has among its elements 
each of the p”~-! (m—1)-ads with which we can associate the v-sequence 
(1,1,---, 1). By expanding Gp in cosets as regards Go, we then easily verify 


that Go likewise has each of the (m—1)-ads with which we can associate the - 


v-sequence of any one of its members, there being exactly p"-! (m—1)-ads 
for each v-sequence. Likewise, by expanding G in cosets as regards Go, which 
is invariant under G, we find that G has every m-adic substitution on the I'’s 
with which we can associate the u-sequence of any one of its members, there 
being exactly p”~! such substitutions for each u-sequence. 

G is therefore determined by the set of $\mu$-sequences of its members. Actually,
if $s_i$ in G has the $\mu$-sequence $[\mu_i', \mu_i'', \cdots, \mu_i^{(m-1)}]$, then $s = s_1 s_2 \cdots s_m$,


also in G, will have the u-sequence [yu’, --- , ] given by the equa- 
tions 


= 


= Mi Me 


(m—1) (m—1) 7 (m—1) 
M2 * bm ’ 


the products being reduced modulo p to one of the numbers 1, 2, ..., p−1.
It follows that the $\mu$-sequences of the members of G constitute an m-adic
group under this m-adic operation isomorphic with G, and, indeed, simply
isomorphic with the quotient group G/Go. G is therefore determined by its
corresponding “$\mu$-subgroup.”

This suggests a development analogous to that of the symmetric and alternating
groups, and their relationship to the complete $\delta$-group. Let P, the
“symmetric power group of degree p,” be the m-adic substitution group of
degree p consisting of all m-adic substitutions on the $\Gamma$'s that transform each
$t_i = (a_{i1} a_{i2} \cdots a_{ip})$ into a power of $t_{i+1}$. Its associated group $P_0$ will then
consist of all (m−1)-ads on the $\Gamma$'s transforming each $t_i$ into a power of
itself. Note that actually P is ident’cal with the preceding G‘®, Py with Go, 
a fact that merely emphasizes that for the class of groups under consideration
these groups are independent of G. With each t in $P_0$ we can associate
a $\nu$-sequence, with each s in P, a $\mu$-sequence. Clearly, for each of the
$(p-1)^{m-1}$ possible $\nu$-sequences there is at least one t in $P_0$ having that
$\nu$-sequence, and hence, as for $G_0$, $p^{m-1}$ such t's. $P_0$ is therefore of order $[p(p-1)]^{m-1}$.
Hence the symmetric power group P is also of order $[p(p-1)]^{m-1}$, having
$p^{m-1}$ substitutions for each of the $(p-1)^{m-1}$ possible $\mu$-sequences. Actually,
if $s_0$ is the m-adic substitution carrying each $a_{ij}$ into $a_{(i+1)j}$, it will be in P
with the $\mu$-sequence [1, 1, ..., 1], so that we can write $P = P_0 s_0$; and if in
$s = t s_0$ we let t run through all substitutions in $P_0$ with a given $\nu$-sequence, s
will run through all substitutions in P having that $\nu$-sequence for $\mu$-sequence.

Under the above m-adic operation the $(p-1)^{m-1}$ possible $\mu$-sequences constitute
an m-adic group which we may call the complete $\mu$-group. As in the
case of the complete $\delta$-group, the associated group of the complete $\mu$-group
may be considered to have the $(p-1)^{m-1}$ $\nu$-sequences as elements under the
dyadic operation

$(\nu_1', \nu_1'', \cdots, \nu_1^{(m-1)})(\nu_2', \nu_2'', \cdots, \nu_2^{(m-1)}) = (\nu_1'\nu_2', \nu_1''\nu_2'', \cdots, \nu_1^{(m-1)}\nu_2^{(m-1)})$,

the $\nu$-sequence corresponding to an (m−1)-ad of $\mu$-sequences
being given by the above operation on $\mu$-sequences with the $\mu_m$'s
omitted. The complete $\mu$-group is therefore semi-abelian; and by the use
of the $\mu$-sequence [1, 1, ..., 1] the unique $\nu$-sequence into which
$(\nu', \nu'', \cdots, \nu^{(m-1)})$ is transformed by every $\mu$-sequence is again found
to be $(\nu^{(m-1)}, \nu', \cdots, \nu^{(m-2)})$.

Corresponding to each subgroup of the complete $\mu$-group there will be an


“alternating power group,” the m-adic substitution group of degree p consisting
of all the m-adic substitutions on the $\Gamma$'s with $\mu$-sequences in the $\mu$-subgroup
under consideration. Each of our G's is therefore an alternating power
group(*). To complete our investigation within its present scope we need
merely find which of the alternating power groups are primitive groups of
class p. Actually they are all primitive. For their Gj is the primitive Py, 
so that their Gj is primitive. And their Gp is always Po, which can carry 
any selection of letters chosen from the $\Gamma$'s into any other selection, so that
in fact, they are cyclically transitive. As for their class, it is immediately
seen to be at most p. Now actually a substitution on the letters of $\Gamma_i$ carrying
$t_i = (a_{i1} a_{i2} \cdots a_{ip})$ into a power of itself other than the first must be of class
p−1. It follows that an alternating power group of degree p is of class less
than p, in fact p−1, when and only when the associated group of its $\mu$-subgroup
has a $\nu$-sequence with one and only one number not unity. This is easily
transformed into a condition on the $\mu$-subgroup itself to yield the following
result. The primitive groups of class p and degree p, p being a prime greater
than 3, are the alternating power groups of degree p whose $\mu$-subgroups do not
have a pair of $\mu$-sequences differing in one and only one component(*).


B. FINITE ABSTRACT POLYADIC GROUPS 


21. Cyclic polyadic groups; ordinary theory(**). Given the m-adic operation
c, we define the m-adic powers of an element s under c inductively as
follows. s itself will be rewritten $s^{[0]}$; and having $s^{[n]}$, we define $s^{[n+1]}$ as
$c(s^{[n]} s \cdots s)$. If then $s^{[n]}$ be written out in full, n is the number of c's occurring
in the resulting extended operation, the number of s's being n(m−1)+1.
By the associative law it follows that any extended operation involving n c's
and but the single element s repeated can be rewritten in the form $s^{[n]}$. We
thus easily obtain the following m-adic power laws:

$c(s^{[n_1]} s^{[n_2]} \cdots s^{[n_m]}) = s^{[n_1+n_2+\cdots+n_m+1]}$, $(s^{[n_1]})^{[n_2]} = s^{[(m-1)n_1 n_2+n_1+n_2]}$.

Note that for m=2 our nth power is the ordinary (n+1)-st power(®).
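The inductive definition of m-adic powers, and the two power laws in the form $c(s^{[n_1]} \cdots s^{[n_m]}) = s^{[n_1+\cdots+n_m+1]}$ and $(s^{[n_1]})^{[n_2]} = s^{[(m-1)n_1n_2+n_1+n_2]}$, can be checked numerically. The following is a minimal sketch under an assumption not made in the text: the m-adic group is modelled by the integers modulo N with c taken as the sum of its m arguments. The names `make_op`, `madic_power` and `check_power_laws` are illustrative only.

```python
from itertools import product

def make_op(N):
    """Assumed model: the m-adic operation on Z_N is the sum of its arguments mod N."""
    def c(*xs):
        return sum(xs) % N
    return c

def madic_power(c, s, n, m):
    """s^[n], defined inductively: s^[0] = s and s^[n+1] = c(s^[n], s, ..., s)."""
    x = s
    for _ in range(n):
        x = c(x, *([s] * (m - 1)))
    return x

def check_power_laws(N, m, s, bound=4):
    """Check both power laws for all exponents below `bound`."""
    c = make_op(N)
    # Law for powers of powers: (s^[n1])^[n2] = s^[(m-1) n1 n2 + n1 + n2]
    for n1, n2 in product(range(bound), repeat=2):
        lhs = madic_power(c, madic_power(c, s, n1, m), n2, m)
        rhs = madic_power(c, s, (m - 1) * n1 * n2 + n1 + n2, m)
        if lhs != rhs:
            return False
    # Law for products of powers: c(s^[n1], ..., s^[nm]) = s^[n1 + ... + nm + 1]
    for ns in product(range(bound), repeat=m):
        lhs = c(*(madic_power(c, s, n, m) for n in ns))
        rhs = madic_power(c, s, sum(ns) + 1, m)
        if lhs != rhs:
            return False
    return True
```

For m = 2 the function `madic_power` returns the ordinary (n+1)-st multiple of s in this additive model, in agreement with the remark above.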


(®) Unless it were P itself. But P is readily seen to be of class p—1. 

(*) The actual problem of determining the subgroups of the complete $\mu$-group remains
unsolved. Gill has pointed out to the writer that while the problem of determining the associated
groups of these $\mu$-subgroups can superficially be expressed as a problem in V.A.G.’s,
actually the theory is now inapplicable, since the coefficients of the polynomials no longer form 
a field. 

(*) For the special case m =3, the results of the present section reduce to those given by 
Lehmer. Likewise those of the next section involving mere reducibility, now of necessity to a 
2-group. 

(®) By contrast, Dörnte writes $a^z$ in usual notation with, however, z subject to the restriction
$z \equiv 1 \pmod{m-1}$. While our laws of powers are, as a result, more complicated than
Dörnte's, we find great comfort in the fact that our $s^{[n]}$ is an “m-adic element” for every positive
integral, or zero, n. Our lack of negative m-adic powers could easily be supplied.




If s is an element of an m-adic group K, each of its m-adic powers will
represent elements of K. With K a finite group we therefore must have
$s^{[\nu+n]} = s^{[\nu]}$ for some $\nu$ and n>0. Since $s^{[\nu+n]}$ can be rewritten
$c(s^{[\nu]} s \cdots s s^{[n-1]})$, it follows that $\{s, \cdots, s, s^{[n-1]}\}$ is an identity, whence
we have

$s^{[n]} = s$.

The smallest positive integral value of n for which this equation holds will
be called the (m-adic) order of s. If then s is of order g, the sequence of its
m-adic powers starts with g distinct elements $s^{[0]}, s^{[1]}, \cdots, s^{[g-1]}$ which are then
repeated in order. It follows on the one hand that $s^{[n]} = s$ when and only when
n is a multiple of g; and, more generally, that $s^{[n_1]} = s^{[n_2]}$ when and only when
$n_1 - n_2$ is a multiple of g. On the other hand, since but a finite number of elements
are involved, our first law of m-adic powers shows that the g distinct
elements constitute an abelian m-adic group G of order g which may then be
called the cyclic m-adic group generated by s. The order of s is therefore equal
to the order of the cyclic group it generates. Again by the first law of m-adic
powers it is immediately seen that two cyclic m-groups of the same order are
simply isomorphic. Furthermore, the same law shows that apart from an assumed
m-group K, g distinct elements $s_0, s_1, \cdots, s_{g-1}$, subject to the m-adic
operation obtained by writing $s_n = s^{[n]}$, with $s^{[g]} = s$, constitute an m-group
which is then the cyclic m-group of order g generated by $s = s_0$. Hence, as in
ordinary group theory, we may say there is one and only one cyclic m-group
whose order is an arbitrary natural number(*7).

Let then G be the cyclic m-group of order g, s a generator of G. We first
ask for the order of any power $s^{[n]}$ of s. This will be the least value of N for
which $(s^{[n]})^{[N]} = s^{[n]}$, hence the least value of N for which (m−1)nN+N+n−n
= [(m−1)n+1]N is a multiple of g. It follows that the order of $s^{[n]}$ is
equal to the order of s divided by the highest common factor of (m−1)n+1 and
the order of s. In particular, the order of $s^{[n]}$ will be the same as the order of s
when and only when (m−1)n+1 is prime to the order of s. Hence an element
s is generated by those and only those of its m-adic powers $s^{[n]}$ for which
(m−1)n+1 is prime to the order of s.
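The order formula just obtained lends itself to a direct check. In the sketch below the cyclic m-group of order g is represented, as an assumption made for illustration, by the exponents n modulo g, an element $s^{[n]}$ being stored as n and the operation adding its arguments and 1. The function names are hypothetical.

```python
from math import gcd

def cyclic_op(g):
    """Operation of the cyclic m-group of order g on exponents: s^[n] is stored as n mod g."""
    def c(*ns):
        return (sum(ns) + 1) % g
    return c

def order_by_iteration(n, m, g):
    """m-adic order of s^[n]: the least N > 0 with (s^[n])^[N] = s^[n]."""
    c = cyclic_op(g)
    start = n % g
    x, N = start, 0
    while True:
        # One more m-adic power: combine the current power with m-1 copies of s^[n].
        x = c(x, *([start] * (m - 1)))
        N += 1
        if x == start:
            return N

def order_by_formula(n, m, g):
    """Order of s^[n] as g divided by the highest common factor of (m-1)n+1 and g."""
    return g // gcd((m - 1) * n + 1, g)
```

With m = 3 and g = 12, for instance, the element $s^{[1]}$ has (m−1)n+1 = 3, and both functions give the order 4.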

We can now determine what orders the elements of G can have. $\gamma$ will be
the order of an element of G if $\gamma = g/d$, d = H.C.F. [(m−1)n+1, g] for some n.
It is necessary then that d be a divisor of g, and prime to m−1. We now
show that this is also sufficient. We have to find, then, an n and k such that
(m−1)n+1 = kd, $g = \gamma d$ with k relatively prime to $\gamma$. Since m−1 is prime
to d by hypothesis, for some $n = n_0$, $k = k_0$, we will have $(m-1)n_0+1 = k_0 d$.


(*7) The following discussion tacitly assumes that a symbol representing the order of an 
element or group is restricted to positive integral values, one representing an m-adic power to 
non-negative integral values. On the other hand, symbols entering into a diophantine equation 
may at first be allowed to assume arbitrary integral values which are then restricted in the 
above manner as the need arises. 




The general solution of (m−1)n+1 = kd is then given by $n = n_0 + \lambda d$,
$k = k_0 + \lambda(m-1)$ with arbitrary $\lambda$. Now the particular solution shows $k_0$ to be prime
to m−1. Hence the arithmetic progression $k_0 + \lambda(m-1)$ has, indeed, an infinite
number of primes, and hence certainly a number prime to $\gamma$ as was to
be proved. We thus have the following result. A cyclic m-group of order g has
at least one element of every order $\gamma$ such that $\gamma$ is a divisor of g, and $g/\gamma$ is prime
to m−1, and no element of any other order. In particular, a cyclic m-group of
order g has a first order element when and only when g is prime to m−1.
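The result on possible orders can be illustrated by comparing the orders actually attained with those predicted. The sketch below uses only the order formula g/H.C.F.[(m−1)n+1, g] established above; the function names are illustrative.

```python
from math import gcd

def element_orders(m, g):
    """Orders attained by the elements s^[n], n = 0, ..., g-1, of a cyclic m-group of order g."""
    return {g // gcd((m - 1) * n + 1, g) for n in range(g)}

def admissible_orders(m, g):
    """Divisors gamma of g for which g/gamma is prime to m-1."""
    return {y for y in range(1, g + 1)
            if g % y == 0 and gcd(g // y, m - 1) == 1}

def has_first_order_element(m, g):
    """True when some element has m-adic order 1."""
    return 1 in element_orders(m, g)
```

For m = 3, g = 12 both sets are {4, 12}; for m = 4, g = 12 they are {3, 6, 12}.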

We can now generalize the ordinary cyclic group argument to prove the
following. A cyclic m-group of order g has one and only one subgroup whose
order is any given divisor $\gamma$ of g such that $g/\gamma$ is prime to m−1, and no others.
The one subgroup is immediately yielded by the cyclic subgroup generated
by an element of order $\gamma$, whose existence is insured by the preceding result.
For the converse, consider any subgroup of the given cyclic group, and let
its order be $\gamma$. By Lagrange's theorem extended, $\gamma$ is a divisor of g. By the
same theorem, each element s of the subgroup has an order which is a divisor
of $\gamma$, and hence must satisfy the equation $s^{[\gamma]} = s$. Now consider all the elements
of the given cyclic group, generated, say, by $s_0$, that satisfy this equation.
If $s = s_0^{[n]}$, we have, as in a preceding argument, that $\gamma[(m-1)n+1] = kg$
for some k, and hence, with $g/\gamma = d$, that (m−1)n+1 = kd, and conversely.
We first see that d is prime to m−1, and hence that $\gamma$ is the order of a cyclic
subgroup of the given group. Furthermore, since $\gamma d = g$, our general solution
$n = n_0 + \lambda d$, $k = k_0 + \lambda(m-1)$, of the equation (m−1)n+1 = kd shows that exactly
$\gamma$ such n's are to be found with values in the set 0, 1, 2, ..., g−1.
Hence the elements s satisfying the equation $s^{[\gamma]} = s$ are exactly $\gamma$ in number,
and consequently must be the $\gamma$ elements of the above cyclic subgroup of
order $\gamma$. Our assumed subgroup of order $\gamma$ must therefore be that cyclic subgroup,
whence our result.

From this proof flow a number of corollaries. We have immediately that 
every subgroup of a cyclic polyadic group is cyclic. Furthermore, our proof 
shows that an element of given order of a cyclic group is contained in those 
and only those subgroups of the cyclic group whose orders are multiples of 
the order of the element. It follows, on the one hand, that the necessary 
and sufficient condition that one element of a cyclic group generate a second 
is that the order of the first be a multiple of the order of the second. On the 
other hand, we see that two subgroups of a cyclic polyadic group intersect 
in the subgroup whose order is the highest common factor of the orders of 
the given subgroups, and generate the subgroup whose order is the least com- 
mon multiple of those orders. 

Apart from the possible orders of elements and subgroups of a cyclic polyadic
group the above results are the same as for ordinary cyclic groups. Our
condition on those possible orders $\gamma$ can be transformed into the following
more usable form. Let $g_0$ be the largest divisor of g prime to m−1, and let


$\gamma_0 = g/g_0$. Then the cyclic m-group of order g has at least one element, and exactly
one subgroup, of those and only those orders $\gamma$ for which $\gamma = \delta\gamma_0$, $\delta$ a divisor of $g_0$.
In fact, if $\gamma$ is a divisor of g with $g/\gamma$ prime to m−1 as per our original condition,
$g/\gamma$, being a divisor of g prime to m−1, must be a divisor of $g_0$. Hence
$g_0 = \delta(g/\gamma)$ with $\delta$ a divisor of $g_0$, whence $\gamma = \delta\gamma_0$. Conversely, if $\gamma = \delta\gamma_0$ with $\delta$
a divisor of $g_0$, $g/\gamma = g_0/\delta$, so that $\gamma$ is a divisor of g with $g/\gamma$ prime to m−1.

We thus see that $\gamma_0$ is the least order of a subgroup of our cyclic group,
with all of the subgroups of the cyclic group containing the unique subgroup
of order $\gamma_0$. At one extreme, when $\gamma_0 = 1$, which is equivalent to g prime to
m−1, the cyclic group has a subgroup of first order. This corresponds to the
element of first order previously noted, which is now seen to be unique. Every
subgroup then contains this first order element, and their orders are the same
as the orders of the subgroups of an ordinary cyclic group of order g. At the
other extreme $\gamma_0 = g$, which is equivalent to every distinct prime factor of g
being a factor of m−1. The cyclic group then has no (proper) subgroup, each
of its elements being of order g, and thus generating the entire cyclic group(*8).
In particular, if g is a prime p, the corresponding cyclic group is always of one
of these two special types. Note that unlike an ordinary group, a polyadic
group whose order is a prime p need not be cyclic. By the extended Lagrange
theorem its elements must be of order 1 or p. If it has an element of order p,
it must be the cyclic group of order p. However all of its elements may be of
order one, in which case it is noncyclic.

In the general case the orders of the subgroups are multiples of $\gamma_0$, the
multipliers being the orders of the subgroups of an ordinary cyclic group of
order $g_0$. Hence, if $g = p_1^{a_1} p_2^{a_2} \cdots p_\mu^{a_\mu} q_1^{b_1} q_2^{b_2} \cdots q_\nu^{b_\nu}$ with $p_1, p_2, \cdots, p_\mu$ distinct
primes not factors of m−1, $q_1, q_2, \cdots, q_\nu$ factors of m−1, then the number of
subgroups of the cyclic m-group of order g is $(a_1+1)(a_2+1) \cdots (a_\mu+1) - 1$.
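The description of subgroup orders as multiples of the least order by divisors of the largest divisor of g prime to m−1 can be sketched as follows. The helper names `largest_divisor_prime_to` and `subgroup_orders` are illustrative and not taken from the text. The list returned includes the order g of the group itself, so its length is the number of divisors of that largest divisor; dropping the group itself gives one less, which appears to be the count intended by the expression above.

```python
from math import gcd

def largest_divisor_prime_to(g, a):
    """Largest divisor of g that is prime to a (a >= 1)."""
    d = g
    while True:
        h = gcd(d, a)
        if h == 1:
            return d
        d //= h  # strip the common factor and try again

def subgroup_orders(m, g):
    """Orders delta * gamma0 of the subgroups of a cyclic m-group of order g,
    delta running over the divisors of g0 (the group itself included)."""
    g0 = largest_divisor_prime_to(g, m - 1)
    gamma0 = g // g0
    return sorted(delta * gamma0 for delta in range(1, g0 + 1) if g0 % delta == 0)
```

For m = 3 and g = 8 the only order is 8, the case in which every prime factor of g divides m−1; for m = 2 the ordinary divisor list of g is recovered.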

We can now also find an expression for the number of elements of given
order in a cyclic m-group. With that order one for which there is at least one
element, the total number of elements of that order will be the same as the
number of generators of a cyclic m-group of that order. We proceed therefore
to find the number of generators of a cyclic m-group of order g generated,
say, by $s_0$. We first show that if m−1 is prime to g the number of generators
is $\phi(g)$ as for ordinary cyclic groups. In fact, if we recall our formula for the
order of $s^{[n]}$ we see that the number of generators in question is the number
of numbers (m−1)n+1, n = 0, 1, ..., g−1, prime to g. But with m−1 prime
to g this is the same as the number of numbers 0, 1, ..., g−1 prime to g,
that is, $\phi(g)$. Now, in the general case, expand the given cyclic m-group of
order g in cosets as regards its subgroup of order $\gamma_0$. The resulting quotient
group is then an m-group of order $g_0$ prime to m−1. Now let s be any element
of the given group, $\sigma$ the corresponding element of the quotient group. Then s


(*8) Whence our correction of a statement of Miller. 




is a generator of the given group when and only when $\sigma$ is a generator of the
quotient group. That $\sigma$ generates the quotient group if s generates the given
group is immediate. As for the converse, s will then generate a group having a
complete set of multipliers for our coset expansion. But it must also generate
all the elements of the subgroup of order $\gamma_0$, and hence all the elements of the
group. It follows, on the one hand, that the quotient group is itself cyclic, and
hence has $\phi(g_0)$ generators, and hence, finally, that the number of generators of a
cyclic m-group of order g is $\gamma_0\phi(g_0)$.
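The count of generators may be verified by brute force against the closed expression, the least subgroup order multiplied by Euler's function of the largest divisor of g prime to m−1. The sketch below counts the exponents n for which (m−1)n+1 is prime to g; all function names are illustrative.

```python
from math import gcd

def phi(n):
    """Euler's totient, computed directly from its definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def largest_divisor_prime_to(g, a):
    """Largest divisor of g that is prime to a (a >= 1)."""
    d = g
    while gcd(d, a) != 1:
        d //= gcd(d, a)
    return d

def count_generators(m, g):
    """Number of n in 0..g-1 with (m-1)n+1 prime to g, i.e. with s^[n] a generator."""
    return sum(1 for n in range(g) if gcd((m - 1) * n + 1, g) == 1)

def generator_formula(m, g):
    """gamma0 * phi(g0), with g0 the largest divisor of g prime to m-1 and gamma0 = g/g0."""
    g0 = largest_divisor_prime_to(g, m - 1)
    return (g // g0) * phi(g0)
```

For m = 3, g = 12 both give 8; for m = 4, g = 12 both give 6.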

Among the few extensions of topics of the ordinary theory of cyclic groups 
omitted in the above development is that of the kth powers of elements of a 
cyclic group. We state the result for m-groups without further proof. The dis- 
tinct kth powers of a cyclic m-group of order g constitute a subgroup of order g/h 
where h is the highest common factor of g and (m—1)k+1; furthermore, each 
element of this subgroup is the kth power of exactly h elements of the given group. 
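Although the result on kth powers is stated without proof, its counting part is easily illustrated. In the sketch below an element $s^{[n]}$ is again stored as the exponent n modulo g (an assumed representation), so that its kth m-adic power has exponent (m−1)nk+n+k. The function `kth_powers` is a hypothetical helper; it does not itself verify that the powers form a subgroup.

```python
from collections import Counter
from math import gcd

def kth_powers(m, g, k):
    """Map each distinct k-th m-adic power (as an exponent mod g)
    to the number of elements of which it is the k-th power."""
    return Counter(((m - 1) * n * k + n + k) % g for n in range(g))

def check_kth_power_count(m, g, k):
    """Compare with the stated result: g/h distinct powers, each arising h times."""
    h = gcd(g, (m - 1) * k + 1)
    counts = kth_powers(m, g, k)
    return len(counts) == g // h and all(v == h for v in counts.values())
```

With m = 3, g = 12, k = 1 the distinct first powers have exponents 1, 4, 7, 10, each arising from three elements.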

22. Cyclic polyadic groups; polyadic theory. We have observed that a 
cyclic m-group of order g has a first order element when and only when m—1 
is prime to g. As this element, when it exists, is invariant under the group, 
it follows that a cyclic m-group of order g is reducible to a 2-group when and
only when g is prime to m—1. We turn now to the general discussion of reduci- 
bility for cyclic polyadic groups. Our first result is immediate. Every group to 
which a cyclic group is reducible is cyclic. For if s is a generator of the given 
cyclic group, c its operation, c’ the operation of the reduced group, every ele- 
ment of the given group is given by an extended c operation involving s’s 
only, hence also by an extended c’ operation involving s’s only. s is therefore 
a generator of the reduced group, which is thus cyclic. 

In applying our general criterion of reducibility to cyclic groups, questions
of commutativity are automatically disposed of, since every cyclic group is
abelian. A cyclic m-group will then be reducible to a $\mu$-group, $m = k(\mu-1)+1$,
if for some $(\mu-1)$-ad, which may be written $\{s^{[n_1]}, s^{[n_2]}, \cdots, s^{[n_{\mu-1}]}\}$, s being
a generator of the cyclic group, the (m−1)-ad $\{s^{[n_1]}, s^{[n_2]}, \cdots, s^{[n_{\mu-1}]},
s^{[n_1]}, s^{[n_2]}, \cdots, s^{[n_{\mu-1}]}, \cdots, s^{[n_1]}, s^{[n_2]}, \cdots, s^{[n_{\mu-1}]}\}$ (the $(\mu-1)$-ad
repeated k times) is an identity of the cyclic group. Hence also if
$s^{[k(n_1+n_2+\cdots+n_{\mu-1})+1]} = s$, i.e., if kn+1 is a multiple of g, where g is the order
of the group, $n = n_1+n_2+\cdots+n_{\mu-1}$. It follows first that the cyclic m-group
is reducible to a $\mu$-group when and only when $k = (m-1)/(\mu-1)$ is prime to g.
Furthermore, with k prime to g, if kn'+1 and kn''+1 are both multiples of g,
then k(n'−n''), and hence n'−n'', must be a multiple of g. It follows from
our first law of m-adic powers that the $(\mu-1)$-ads corresponding to n' and n''
are equivalent. Recalling our general theory of reducibility, we thus see that
a cyclic m-group is reducible to but one $\mu$-group for each admissible $\mu$.

The least value of $\mu$ for which $(m-1)/(\mu-1)$ is prime to g corresponds
to a k which is the largest divisor of m−1 prime to g. We thus obtain the
following result. The real dimension of a cyclic m-group of order g is equal to
$(m-1)/k_0+1$ where $k_0$ is the largest divisor of m−1 prime to g. In particular




a cyclic m-group of order g is irreducible when and only when each prime factor
of m−1 is also a prime factor of g. Our previous uniqueness result easily enables
us to complete the picture as far as mere reducibility is concerned. We
thus see that the real dimension of a cyclic m-group of order g is its only irreducible
dimension; and the groups to which the cyclic m-group is reducible, all
cyclic, consist of a single group of dimension equal to the real dimension of the
given group, and those of its extensions whose dimensions are of the form
$k(\mu_0-1)+1$, where $\mu_0$ is the real dimension in question, k any proper divisor of
$k_0 = (m-1)/(\mu_0-1)$.
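The reducibility criterion and the formula for the real dimension can be compared computationally. The sketch below tests, for each lower dimension with m−1 a multiple k of that dimension less one, whether kn+1 is a multiple of g for some n, and compares the least admissible dimension with the closed formula. The function names are illustrative and not drawn from the text.

```python
from math import gcd

def largest_divisor_prime_to(a, g):
    """Largest divisor of a that is prime to g."""
    d = a
    while gcd(d, g) != 1:
        d //= gcd(d, g)
    return d

def real_dimension(m, g):
    """(m-1)/k0 + 1, with k0 the largest divisor of m-1 prime to g."""
    k0 = largest_divisor_prime_to(m - 1, g)
    return (m - 1) // k0 + 1

def reducible_dimensions(m, g):
    """Dimensions mu < m to which a cyclic m-group of order g is reducible,
    found by searching for n with k*n + 1 a multiple of g, k = (m-1)/(mu-1)."""
    dims = []
    for mu in range(2, m):
        if (m - 1) % (mu - 1) == 0:
            k = (m - 1) // (mu - 1)
            if any((k * n + 1) % g == 0 for n in range(g)):
                dims.append(mu)
    return dims
```

A cyclic 7-group of order 4, for example, is reducible only to a 3-group, its real dimension; a cyclic 3-group of order 2 is irreducible.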

Since the subgroups of a cyclic group are themselves cyclic, we can find
their real dimensions by applying the above formula. $\gamma$ will be the order of a
subgroup of a cyclic m-group of order g if it is a divisor of g with $g/\gamma$ prime
to m−1. Writing $g = d\gamma$, with d prime to m−1, we see that the largest divisor
of m−1 prime to $\gamma$ is also the largest divisor of m−1 prime to g. Hence,
all the subgroups of a cyclic polyadic group have the same real dimension, namely
the real dimension of the group itself. It follows that a subgroup of a cyclic
m-group is reducible to a $\mu$-group when and only when the given group is
reducible to a $\mu$-group. In particular, all the subgroups of an irreducible cyclic
group are irreducible. We now readily verify that the following simple situation
holds. If a cyclic m-group be reduced to a $\mu$-group, the subgroups of the
m-group are thereby reduced to the subgroups of the $\mu$-group(*). In fact, half of
this situation obtains for arbitrary polyadic groups. For from the very definition
of reducibility, if a polyadic group G is reduced to a polyadic group G',
the subgroups of G' are also subgroups of G, or more exactly, reductions of
subgroups of G. Moreover, if G is abelian, every reduction of a subgroup of G
can be effected by thus reducing G to some G'. For the satisfaction of our general
criterion of reducibility by the subgroup then holds equally well for G,
and the same operation that serves to reduce the subgroup is shown by the
proof of that criterion to reduce G as well. If now G is cyclic and reducible to a
$\mu$-group, every subgroup of G is reducible to a $\mu$-group; and since G can be
reduced to but a single $\mu$-group G', that reduction must reduce all the subgroups
of G to subgroups of G', and hence by the first part of the proof to the
subgroups of G'. Our proof incidentally shows that by varying $\mu$ every possible
reduction of a subgroup of G will thus be obtainable. We furthermore have
the following corollary which, indeed, can easily be proved directly, and itself
used to give a different turn to our proofs. The polyadic orders of the elements of
a cyclic polyadic group remain unchanged under every reduction of the group.

While cyclic groups form a closed set with respect to the two operations 
“subgroup of” and “reduction of,” they do not form a closed set under the 
operation of extension, of which reduction is the inverse, and hence under the 
more general operation of derivation. We proceed to prove that a cyclic 


(*) We prove this result independently of the discussion of complexes which concluded 
§5, since that discussion was extremely sketchy. 




m-group of order g remains cyclic when extended to a $\mu$-group, $\mu = k(m-1)+1$,
when and only when k is prime to g. Let the polyadic nth powers of an element
s in the two groups be written more explicitly $s^{[n]_m}$ and $s^{[n]_\mu}$. By counting c's
we then have immediately

$s^{[n]_\mu} = s^{[kn]_m}$.

Let s be a generator of the extended group, assuming that group to be cyclic.
The elements of that group, and hence of the given group, will then be given
by the $\mu$-adic powers of s, and hence by those m-adic powers of s of the form
$s^{[kn]_m}$. For each N, therefore, there must be an n such that $s^{[N]_m} = s^{[kn]_m}$ and
hence an n and $\nu$ such that $N = kn + g\nu$. This will be so when and only when k
is prime to g.
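The criterion for an extension to remain cyclic can be tested by exhaustive search. In the sketch below the cyclic m-group of order g is stored by exponents modulo g (an assumed representation), and the extended operation, being k applications of c, adds k to the sum of the exponents. The function `extension_is_cyclic` is a hypothetical helper which looks for an element whose polyadic powers in the extended group exhaust the group.

```python
from math import gcd

def extension_is_cyclic(m, g, k):
    """Does the extension of the cyclic m-group of order g to dimension
    mu = k(m-1)+1 possess a generator?  Decided by brute force."""
    mu = k * (m - 1) + 1

    def c_ext(*ns):
        # k nested applications of c, each contributing 1 to the exponent sum
        return (sum(ns) + k) % g

    for x in range(g):
        seen, y = {x}, x
        for _ in range(g):  # g steps suffice to close the cycle of powers
            y = c_ext(y, *([x] * (mu - 1)))
            seen.add(y)
        if len(seen) == g:
            return True
    return False
```

Thus the cyclic 3-group of order 4 ceases to be cyclic when extended with k = 2, but remains cyclic with k = 3.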

A group may therefore be reducible to a cyclic group without itself being 
cyclic. It will be convenient to have the phrase “reducible to a cyclic group” 
cover even the irreducible cyclic groups. The class of groups reducible to 
cyclic groups is therefore a wider class than the class of cyclic groups. While 
it is obviously closed under the operation “extension of,” the situation has 
become obscured so far as the operations “reduction of” and “subgroup of” 
are concerned. It turns out that the following discussion of the corresponding 
associated groups clears up the entire situation. 

We first reinterpret m-adic power and m-adic order in terms of the coset
theorem. More generally, let s be an element of an arbitrary m-group K,
K*’ an arbitrary containing group of K. Then, in the notation of K*’, the
m-adic nth power $s^{[n]}$ of s is the ordinary power of s, $s^{(m-1)n+1}$. The m-adic
order of s is therefore the least positive integral value of n for which
$s^{(m-1)n+1} = s$, i.e., for which $s^{(m-1)n} = 1$. It follows that the m-adic order g of s
is identical with the ordinary order of $s^{m-1}$. As for the ordinary order of s as
element of K*’, we can offhand merely say that it is a divisor of (m−1)g.
If, however, K*’ is of index m−1, in particular if it be the abstract containing
group K* of K, then $s^N = 1$ is possible only if N is a multiple of m−1, and
hence the ordinary order of s will be exactly (m−1)g.
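The relation between m-adic and ordinary order is readily illustrated in an additive model. As an assumption made only for this sketch, the containing group is taken to be the integers modulo M written additively, so that the ordinary power of s with exponent (m−1)n+1 becomes the multiple ((m−1)n+1)s. The function names are illustrative.

```python
from math import gcd

def additive_order(x, M):
    """Ordinary order of x in the additive group of integers mod M."""
    return M // gcd(x, M)

def madic_order(s, m, M):
    """Least n > 0 with ((m-1)n + 1) s = s (mod M), i.e. with (m-1) n s = 0 (mod M)."""
    n = 1
    while ((m - 1) * n * s) % M != 0:
        n += 1
    return n
```

With M = 12, m = 3 and s = 1 the m-adic order is 6, the ordinary order of 2s, while the ordinary order of s is 12 = (m−1)·6, the coset of odd residues being of index m−1 = 2.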

These observations are immediately applicable to our discussion of cyclic 
polyadic groups, and are in turn illuminated thereby. We first observe that 
every containing group G*’ of a cyclic m-group G is cyclic. For if s is a generator
of G, the elements of G being the m-adic powers of s are also ordinary powers
of s in G*’. Hence the elements of G*’, being products of elements of G, are
also ordinary powers of s, and G*’ is an ordinary cyclic group generated by s. 
Note that if G*’ is of index m—1, as is always the case when s is an element 
of K with K*’ of index m—1, then the order of G*’ is m—1 times the order of 
G and thus again the ordinary order of s, m—1 times its m-adic order. 

Since the abstract containing group G* of a cyclic m-group G is cyclic, it 
follows that its subgroup Go, the associated ordinary group of G, is cyclic. 
Indeed, our earlier result to the effect that the m-adic order of s is equal to the 




ordinary order of $s^{m-1}$ shows that an element s of an m-group G generates G
when and only when $s^{m-1}$, then an element of $G_0$, generates $G_0$.

This result can be immediately generalized to the following. The associ- 
ated ordinary group of a group reducible to a cyclic polyadic group is cyclic. For 
the abstract containing group of the cyclic polyadic group is a containing group
of the given group. The abstract associated group of the cyclic polyadic group, 
cyclic by the preceding result, is therefore the associated group of the given 
group corresponding to the above containing group. But we have shown in §6 
that all containing groups of a given polyadic group yield simply isomorphic 
associated groups. 

We have seen that every containing group of a cyclic polyadic group is 
cyclic. While the last argument shows that some containing group of a group 
reducible to a cyclic polyadic group is cyclic, it is not true that every con- 
taining group of such a group is cyclic. In fact it is readily proved that if the 
abstract containing group of a polyadic group is cyclic, the polyadic group 
itself must be cyclic. Hence, while cyclic polyadic groups are characterized 
by the fact that their abstract containing groups are cyclic, we must seek 
elsewhere for a similarly definite characterization of groups reducible to cyclic 
polyadic groups. 

This characterization cannot consist merely of the associated ordinary 
group of a polyadic group being cyclic; for the abelianism of cyclic polyadic 
groups makes every group reducible to a cyclic polyadic group abelian, while 
non-abelian polyadic groups exist whose associated ordinary groups are 
cyclic. The added hypothesis of abelianism is however sufficient. We pro- 
ceed to prove the following result which will enable us to close the entire 
polyadic development of cyclic groups. Every abelian polyadic group with 
cyclic associated ordinary group is reducible to a cyclic polyadic group. Since 
the commutativity of two elements can be tested by any extended operation, 
it follows that an abelian group can be reducible only to an abelian group. 
Coupled with the previous observations on containing and associated groups, 
it follows that if an abelian group with cyclic associated group is reducible 
to a second group, the latter is also an abelian group with cyclic associated 
group. Our result will therefore have been proved if we show that every irre- 
ducible polyadic group of this type is in fact cyclic. 

Let then G be an irreducible abelian m-adic group of order g with cyclic
associated group $G_0$. With $s_0$ a fixed element of G, t a generator of $G_0$, the g
elements of G may be written $s_0 t^n$, n = 0, 1, 2, ..., g−1, in accordance with
the coset theorem. The (m−1)-ad $s_0^{m-1}$ will itself be in $G_0$. Let then $s_0^{m-1} = t^\kappa$.
Since G is abelian, its reducibility to a $\mu$-group, with $m-1 = k(\mu-1)$, would
be equivalent to the existence of a $(\mu-1)$-ad $\{s_0 t^{i_1}, s_0 t^{i_2}, \cdots, s_0 t^{i_{\mu-1}}\}$ such
that $(s_0 t^{i_1} s_0 t^{i_2} \cdots s_0 t^{i_{\mu-1}})^k = 1$, i.e., such that

$\kappa + k(i_1 + i_2 + \cdots + i_{\mu-1}) \equiv 0 \pmod{g}$(7),


(7) G being abelian, $s_0$ and t are commutative.




and hence to the H.C.F.(k, g)'s being a divisor of $\kappa$. It follows that the
irreducibility of G is equivalent to the combined condition, each prime divisor
of m−1 is a divisor of g, $\kappa$ is prime to m−1. On the other hand, we
have seen that an element s of G generates G when and only when $s^{m-1}$ generates
$G_0$. With $s = s_0 t^\nu$, $s^{m-1} = t^{\kappa+(m-1)\nu}$. Since, for our irreducible G, $\kappa$ is prime
to m−1, the arithmetic progression


$\kappa + (m-1)\nu$, $\nu$ = 0, 1, 2, ...,


will certainly include a value which is prime to g. With $\nu$ thus chosen,
$t^{\kappa+(m-1)\nu}$, i.e., $s^{m-1}$, is a generator of the cyclic $G_0$, and hence s of the consequently
cyclic G.

We thus see that the class of polyadic groups reducible to cyclic polyadic 
groups is identical with the class of abelian polyadic groups with cyclic asso- 
ciated groups. The first formulation immediately showed this class of groups 
to be closed under the operation “extension of.” The second formulation was 
already used to prove it closed under the operation “reduction of.” It also 
easily shows the class to be closed under the operation “subgroup of.” For 
such a subgroup must be abelian; while its associated group, being a subgroup 
of the associated group of the parent group, must be cyclic. Hence the class 
of polyadic groups reducible to cyclic polyadic groups is closed under the three 
operations “reduction of,” “extension of,” and “subgroup of.” 

In particular, the net of groups derivable from a cyclic polyadic group, 
or for that matter from a group reducible to a cyclic polyadic group, consists 
wholly of groups reducible to cyclic polyadic groups. The irreducible groups 
of the net are therefore all cyclic. Since we are dealing with abelian groups of 
finite order, the outer real dimension of these groups is 2. Hence the net of 
groups is in fact also derivable from an ordinary cyclic group. We proceed 
then to study the net of groups derivable from a cyclic 2-group of order g. 
Our general theory shows that for each $m \ge 2$ there will be g m-groups in the
net, one for each class of equivalent (m—1)-ads of the 2-group, said class 
serving as the class of identities of the m-group. In terms of the given 2-group, 
equivalent polyads are equivalent to a unique element of the group. Hence 
the groups of the net are determined in 1-1 fashion by letting m run through 
the values 2, 3, 4,---, and s, the element of the 2-group equivalent to their 
identities, run through the g elements of that 2-group. By utilizing the
expression for the operation of a polyadic group in terms of the operation of a
group it is reducible to, and the fact that for any two groups of a net there is a 
third reducible to each, we find the following expression, in terms of the opera- 
tion of the 2-group, for the operation c of an m-group of the net with identities 
equivalent to s:

\[ c(s_1 s_2 \cdots s_m) = s_1 s_2 \cdots s_m s^{-1}. \]

By means of this formula we easily find which groups of the net are cyclic,
and hence also which are the irreducible groups of the net. While it also
enables us to study in detail the relation of reducibility for the groups of the net,
the resulting picture is quite complicated, and will not be entered into here. 

Let $s_0$ be a generator of the cyclic 2-group, and let $s = s_0^{\lambda}$. If then $s_0^{\nu}$ be
any element of the m-group of the net with identities equivalent to s, the
above operation yields the following expression for the corresponding m-adic
nth power of $s_0^{\nu}$:

\[ (s_0^{\nu})^{[n]} = s_0^{[(m-1)n+1]\nu - n\lambda}. \]


We then easily find the condition under which, for some $\nu$, $s_0^{\nu}$ is a generator
of the m-group, and thus obtain the following result. If $s_0$ is a generator of an
ordinary cyclic group of order g, the cyclic groups of the net of groups derivable
from the given group are those m-groups whose identities are equivalent to an $s_0^{\lambda}$
for which H.C.F.$(m-1, \lambda, g) = 1$. If $\gamma$ is the order of $s_0^{\lambda}$ in the 2-group, this
condition is equivalent to $g/\gamma$ prime to $m-1$. Thus all of the g m-groups of
the net for given m are cyclic when and only when $m-1$ is prime to g. Since
the irreducible groups of the net are the irreducible cyclic groups of the net,
we see that the irreducible groups of the net are those for which the prime divisors
of $m-1$ are all divisors of g while $\lambda$ is prime to $m-1$. Hence, for $g \ge 2$,
a cyclic polyadic group of order g has an infinite number of outer irreducible
dimensions.
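
The cyclicity criterion can be tested by brute force on small cases. The sketch below rests on assumptions that are not spelled out in this form in the text: the cyclic 2-group is modelled as the integers mod g under addition with generator 1, the element s is the residue `lam`, and the m-adic operation is taken to be c(x_1, ..., x_m) = x_1 + ... + x_m - lam (mod g). The function names are illustrative.

```python
from math import gcd

def madic_powers(nu, m, lam, g):
    """Set of m-adic powers of the element nu under the assumed operation
    c(x_1..x_m) = x_1 + ... + x_m - lam (mod g)."""
    seen, x = set(), nu
    while x not in seen:
        seen.add(x)
        # next m-adic power: c(x, nu, ..., nu) with m-1 copies of nu
        x = (x + (m - 1) * nu - lam) % g
    return seen

def is_cyclic(m, lam, g):
    """True when some single element generates all g elements."""
    return any(len(madic_powers(nu, m, lam, g)) == g for nu in range(g))

# Compare with the stated criterion H.C.F.(m-1, lam, g) = 1 on small cases.
for g in range(1, 13):
    for m in range(2, 9):
        for lam in range(g):
            assert is_cyclic(m, lam, g) == (gcd(gcd(m - 1, lam), g) == 1)
```

For instance, with g = 4 and m = 3 the group with lam = 1 is cyclic under this model, while the one with lam = 0 is not.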

The full force of our closedness results for groups reducible to cyclic poly- 
adic groups is brought out by the complexes obtained from such groups. We 
have then that the complex of groups obtainable from a cyclic polyadic group, 
or, in general, from a group reducible to a cyclic polyadic group, consists 
wholly of groups reducible to cyclic polyadic groups. We recall that the groups 
of any complex separate into mutually exclusive nets, there being a 1-1 cor- 
respondence between these nets and the different classes of elements the 
groups of the complex can have. In the present instance each net is of the 
type discussed above, being derivable from a group reducible to a cyclic polyadic
group. Furthermore, these “group-bearing” classes now admit of very
simple description. As most of the resulting picture holds good for arbitrary 
finite abelian polyadic groups we so present our development. 

Observe first that our simplification of the operations yielding an arbi- 
trary complex shows that its group-bearing classes, apart from that of the 
initial group, can all be obtained from the subgroups of the extensions of the 
initial group. Since a finite abelian polyadic group is always derivable from 
a 2-group, we may then assume that initial group to be a 2-group. That 
2-group is then its own associated and containing group, and can be identified 
with the associated and containing group of each of its extensions. The rela- 
tionship between the subgroups of a polyadic group and of its associated 
group, actually valid for an arbitrary containing group, then yields the following
result. The group-bearing classes of a complex obtained from a finite
abelian polyadic group are, apart from the class of elements of the given group,
the classes of elements of the subgroups of any 2-group derivable from the given 
group and the cosets of those subgroups. 

We recall that the problem of the intersection of two subcomplexes of a 
complex was reduced to that of the intersection of their corresponding group- 
bearing classes. The above result then shows that, for finite abelian groups, 
either two group-bearing classes have no elements in common, or their com- 
mon elements constitute an augmented coset of the crosscut of the subgroups 
of the 2-group of which they are augmented cosets. Note actually that the 
2-groups derivable from the given finite abelian group are in 1-1 correspond- 
ence with the elements of the group, the element corresponding to a 2-group 
being the identity of the 2-group. If then s be any element of the given group, 
the group-bearing classes containing s constitute the subgroups of the 2-group 
having s as identity. It follows that if two group-bearing classes of the complex 
obtained from a finite abelian group have a common element, they are the classes 
of elements of two subgroups of one and the same 2-group derivable from the given 
group, and intersect accordingly. 

In particular, then, for a cyclic polyadic group of order g the group-bearing
classes of its complex are $g/\gamma$ in number, of $\gamma$ elements each, for every
divisor $\gamma$ of g. And two group-bearing classes, with $\gamma_1$ and $\gamma_2$ elements respectively,
either have no elements in common, or exactly H.C.F.$(\gamma_1, \gamma_2)$ elements
in common.
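
The counting statement can be illustrated in a small model. The sketch below assumes, for illustration only, that the cyclic group is written additively as the integers mod g, so that the group-bearing classes are the subgroups of this group together with their cosets. The names `classes` and `check` are hypothetical.

```python
from math import gcd

def classes(g):
    """All cosets of all subgroups of Z_g, grouped by subgroup order gamma."""
    out = {}
    for gamma in (d for d in range(1, g + 1) if g % d == 0):
        step = g // gamma                     # the subgroup is the multiples of step
        sub = [(step * j) % g for j in range(gamma)]
        out[gamma] = {frozenset((a + h) % g for h in sub) for a in range(g)}
    return out

def check(g):
    """Verify the counts and the intersection sizes stated in the text."""
    cl = classes(g)
    for gamma, cosets in cl.items():
        assert len(cosets) == g // gamma      # g/gamma classes ...
        assert all(len(c) == gamma for c in cosets)   # ... of gamma elements each
    for g1, c1 in cl.items():
        for g2, c2 in cl.items():
            for a in c1:
                for b in c2:
                    # empty, or exactly H.C.F.(g1, g2) common elements
                    assert len(a & b) in (0, gcd(g1, g2))
    return True
```

Calling `check(12)` runs through all 28 classes of the model for g = 12.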

The above development can be given a somewhat different turn. For any 
finite polyadic group a finite number of extensions of the group, and subgroups
of those extensions, suffice to yield all group-bearing classes, as these
are now finite in number. From the corresponding situation for a pair of 
groups of a net it follows that for any finite number of groups of a net there is 
a group of the net itself reducible to each of the given groups. Hence the above 
extensions can themselves be extended to one and the same group. In this 
process the subgroups of these groups are extended to subgroups of the result- 
ing group. Hence, the group-bearing classes of the complex obtained from a finite 
polyadic group are the classes of elements of a single suitable extension of the 
group, and of the subgroups of that extension. For any finite polyadic group, 
therefore, the intersection of two group-bearing classes can be pictured as the 
intersection of two subgroups of one and the same extension of that group. 
And now for the earlier picture. Clearly any element of finite order in a polyadic
group is of first order in some extension of that group, and hence is the
sole member of a group-bearing class of the complex obtained from that 
group. It follows that the elements of the above “suitable extension” of a 
finite polyadic group are all of first order. Hence that extension will itself be 
reducible to each of the 2-groups derivable from the given group. If, further- 
more, the given group is abelian, each of its elements s will be the identity of 
a 2-group to which that extension is reducible, and the subgroups of that extension
containing s will thereby be reduced to the subgroups of the 2-group;
hence that first picture.

In conclusion, then, while the theory of cyclic groups requires for its com- 
pletion the introduction of groups reducible to cyclic polyadic groups, the 
theory of these groups is entirely self-contained. While it would therefore be 
desirable to complete this theory by developing the properties of these groups, 
and we have at hand the instruments that would yield this development, we 
have perhaps already spent too much time on such very special develop- 
ments, and so pass on to the more general topics of the theory. 

23. Abstract polyadic groups of the first three orders. The concepts of the 
last two sections give a certain basis for distinguishing between polyadic 
groups. As in ordinary theory, in counting abstract polyadic groups no dis- 
tinction will be made between groups that are simply isomorphic. By con- 
trast, in the theory of reducibility such a distinction is imperative, for two 
groups on the same class of elements, but with different multiplication tables, 
must there be considered different even if simply isomorphic. Our present 
interest lies not only in the results to be obtained but in the illustrations of 
method thus afforded. 

For each $m \ge 2$ there is of course the single abstract m-group of order one.
Its sole element is of the first order, and hence the group is cyclic, and reduc- 
ible to the cyclic 2-group whose sole element is the identity. 

The abstract m-groups of order two can be determined directly from their
possible multiplication tables(“). If they are written on the abstract elements
$\alpha$ and $\beta$, and $c$ represents the m-adic operation, the value of $c(\alpha\alpha \cdots \alpha)$,
that is, of $\alpha^{[1]}$, determines the table; for each change in the value of an argument
must change the value of the result. Hence there are at most two abstract
m-groups of order two. It further follows that $\alpha^{[1]}$ is, or is not, equal
to $\beta^{[1]}$ according as m is even or odd. If m is even, then if $\alpha^{[1]} = \alpha$, $\beta^{[1]} = \alpha$,
while if $\alpha^{[1]} = \beta$, $\beta^{[1]} = \beta$; and the two possible groups are changed into each
other on interchanging $\alpha$ and $\beta$. On the other hand, if m is odd, if $\alpha^{[1]} = \alpha$,
$\beta^{[1]} = \beta$, and if $\alpha^{[1]} = \beta$, $\beta^{[1]} = \alpha$, and the two groups cannot be simply isomorphic.
These groups are then readily identified to yield the following result.
When m is even, there is but one abstract m-group of order two, namely,
the cyclic m-group of order two. It then consists of one first order element and 
one second order element, and is reducible to the ordinary cyclic group of 
order two, if it be not that group. When m is odd there are exactly two ab- 
stract m-groups of order two; one group consisting of two first order elements, 
and being the non-cyclic second order m-group reducible to the ordinary 
cyclic group of order two, the other group being the cyclic m-group of order 
two, consisting of two second order elements, and hence not reducible to a 
2-group. 
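
The count of abstract m-groups of order two can be reproduced mechanically. The following sketch makes some assumptions for illustration: the two elements are coded as 0 and 1, and the rule that each change of an argument changes the result is realised as c(x_1, ..., x_m) = x_1 + ... + x_m + eps (mod 2), where eps is the value of c(0, ..., 0). The function names are hypothetical.

```python
from itertools import product

def table(m, eps):
    """m-adic table on {0, 1} fixed by c(0,...,0) = eps and the rule that
    changing any one argument changes the result."""
    return {xs: (sum(xs) + eps) % 2 for xs in product((0, 1), repeat=m)}

def swapped(tab):
    """Image of a table under the interchange of the two elements."""
    return {tuple(1 - x for x in xs): 1 - v for xs, v in tab.items()}

def abstract_count(m):
    """1 if the interchange carries one table into the other, otherwise 2.
    The interchange is the only candidate isomorphism, since the two tables differ."""
    return 1 if swapped(table(m, 0)) == table(m, 1) else 2

def first_order_elements(m, eps):
    """Elements x with c(x, ..., x) = x, i.e. elements of the first order."""
    tab = table(m, eps)
    return [x for x in (0, 1) if tab[(x,) * m] == x]
```

Under these assumptions `abstract_count(m)` gives one group for even m and two for odd m, and `first_order_elements` shows the element orders described above.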

(“) Dörnte used this method to determine the number of m-groups on two symbols as
elements, but did not consider the question of those m-groups being abstractly the same. 


To obtain the abstract m-groups $G$ of order three, we employ the general
coset theorem method of §8. The associated ordinary group $G_0$ must be cyclic,
and hence its elements may be written $1, t, t^2$. If $s_0$ be a fixed element of $G$
with $s_0^{m-1} = t_0$ in $G_0$, we may assume that either (1) $t_0 = 1$, (2) $t_0 = t$; for were
$t_0 = t^2$, groups simply isomorphic with those of case (2) would result. $G_0$, furthermore,
admits of but two automorphisms, i.e., (a) the identical automorphism,
(b) the automorphism interchanging $t$ and $t^2$ while, of course,
leaving 1 invariant. With either of these automorphisms as the automorphism
of $G_0$ under $s_0$, and either of the two choices of $t_0$, an m-group will be correspondingly
determined provided (A) the automorphism carries $t_0$ into itself,
(B) the $(m-1)$-st power of the automorphism is the automorphism of $G_0$ under
$t_0$. Of the four cases thus to be considered (1) (a) and (2) (a) satisfy both
(A) and (B) for all m's, and hence always determine a corresponding m-group.
(1) (b) satisfies (A) for all m's, but (B) only for m odd; for if m be even, the
$(m-1)$-st power of the automorphism interchanges $t$ and $t^2$ whereas $t_0 = 1$
leaves them unchanged. Hence (1) (b) determines an m-group when and only
when m is odd. Finally, there is no polyadic group of order three corresponding
to (2) (b), as (A) is then never satisfied.

We now identify and distinguish between the groups thus determined. 
The group (1) (a) is abelian since $s_0$ and $t$ are then commutative. Since $G_0$ is
cyclic, $G$ is therefore cyclic, or reducible to a cyclic group. Direct calculation
then shows that if $m-1$ is a multiple of 3, each element is of first order, and
hence the group is noncyclic, but reducible to the ordinary cyclic group of
order three. On the other hand, when $m-1$ is not a multiple of 3 we find
that while $s_0$ is of first order, $t s_0$, and in fact $t^2 s_0$, are not, and hence must be
of the third order. The group is therefore cyclic, but reducible to the ordinary
cyclic group.

In the case of the group (2) (a), $s_0$, not being of the first order, must be
of the third order. The group is therefore cyclic. When m—1 is not a multiple 
of 3 it is therefore simply isomorphic with the group (1) (a). On the other 
hand, when m—1 is a multiple of 3, and hence not prime to g=3, the group 
contains no first order element. It is therefore not reducible to an ordinary 
group, and consists of three third order elements. 

Finally group (1) (b), m odd, is non-abelian, since $s_0$ does not leave $t$ invariant.
Being therefore noncyclic, each of its elements is of the first order.
We have already given this group with m=3 as an example of one with no
invariant element. This property holds for each admissible m. In fact, since
any two of the three elements must generate the whole non-abelian group, 
each element is invariant under no other element than itself. It follows that 
each element transforms a second element into the third, a property which 
by itself can be shown to determine the multiplication table of that third 
order m-group for odd m. It is needless to add that this group is not reducible 
to an ordinary group. 


The third order abstract polyadic groups may then be tabulated as follows,
the numbers in the parentheses being the orders of the elements.

                         μ = 0, 1, 2, ...

m-1 =                  | 6μ+1 | 6μ+2 | 6μ+3 | 6μ+4 | 6μ+5 | 6μ+6
cyclic (3, 3, 1)       |  1   |  1   |      |  1   |  1   |
cyclic (3, 3, 3)       |      |      |  1   |      |      |  1
abelian (1, 1, 1)      |      |      |  1   |      |      |  1
non-abelian (1, 1, 1)  |      |  1   |      |  1   |      |  1
total                  |  1   |  2   |  2   |  2   |  1   |  3


In particular, the one ordinary third order group comes under the case $m-1 = 6\mu+1$
with $\mu = 0$. We further see that the smallest value of m for which
there are three abstract third order groups is 7(7?).

24. Properties of transforms. The coset theorem enabled us to write the
transform of an element s by a polyad r in the ordinary form $r^{-1} s r$. A fundamental
m-group is of course tacitly presupposed. Since the m-adic nth power
of an element can likewise be written as an ordinary $(m-1)n+1$ power, it
follows that

\[ (r^{-1} s r)^{[n]} = r^{-1} s^{[n]} r. \]

Hence, also, the m-adic order of an element is unchanged under transforma- 
tion. 

Suppose now that $r^{-1} s r = s^{[\alpha]}$. By raising both sides of this equation to the
m-adic $\beta$th power we then have

\[ r^{-1} s^{[\beta]} r = (s^{[\beta]})^{[\alpha]}; \]

for our m-adic formula for the power of a power shows that $(s^{[\alpha]})^{[\beta]} = (s^{[\beta]})^{[\alpha]}$.
Hence we have the following generalization of the corresponding ordinary
theorem. If a polyad transforms a generator of a cyclic m-group into its $\alpha$th power,
it transforms every element of this cyclic group into its $\alpha$th power.
Commutativity is related to transform through invariance. Given two
noncommutative elements $s_0$ and $s$, we consider what m-adic powers, if any,
of $s$ are commutative with $s_0$. If $s$ is of m-adic order $k$, its ordinary order in the
fundamental abstract containing group is $(m-1)k$. Let $\gamma_0$ be the least positive
value of $\gamma$ for which the ordinary power $s^{\gamma}$ is commutative with $s_0$. $\gamma_0$ is
then a divisor of $(m-1)k$ and the distinct ordinary powers of $s$ commutative
with $s_0$ are $s^{n\gamma_0}$, $n = 1, 2, \ldots, (m-1)k/\gamma_0$. The m-adic powers $s^{[\beta]}$ commuta-


(7) The two third order m-groups falling under the case m—1=6y+2 have been given by 
Miller for 


tive with $s_0$ are those for which $(m-1)\beta+1$ is a multiple of $\gamma_0$. It follows first
that there will be an m-adic power of $s$ commutative with $s_0$ when and only
when $\gamma_0$ is prime to $m-1$. $\gamma_0$ is then a divisor of $k$; and if $\beta_0$ is the least value
of $\beta$ for which $s^{[\beta]}$ is commutative with $s_0$, the m-adic powers of $s$ commutative
with $s_0$ are $s^{[\beta_0 + n\gamma_0]}$, $n = 0, 1, \ldots, (k/\gamma_0 - 1)$. Actually these $k/\gamma_0$ m-adic powers
of $s$ commutative with $s_0$ must constitute a subgroup, necessarily cyclic,
of the cyclic m-group generated by $s$. They are therefore the m-adic powers
of some one of their number, not, however, necessarily of $s^{[\beta_0]}$(74).
If we form the successive transforms

\[ s^{-1} s_0 s = s_1, \quad s^{-1} s_1 s = s_2, \quad s^{-1} s_2 s = s_3, \quad \ldots, \]


the resulting elements are the transforms of so under the various ordinary 
powers of s. In general, therefore, they will not all be gotten by transforming 
So by the elements of the cyclic m-group generated by s, but by the elements 
of the abstract containing group of that cyclic m-group, or, what is the same 
thing, by the various polyads of the cyclic m-group. 

This suggests that given any m-group $G$ and element $s_0$ we consider the
transforms of $s_0$ under the various polyads of $G$. These will then constitute
a complete set of conjugates of $s_0$ under the abstract containing group $G^*$ of $G$.
The following discussion applies equally well to an m-group $K$ taking the
place of the element $s_0$.

With $s$ an element in $G$, $G_0$ the associated ordinary group of $G$, we have
the expansion $G^* = G_0 s + G_0 s^2 + \cdots + G_0 s^{m-2} + G_0$, with $G_0 s = G$. Since $G_0$ is
an ordinary group, the number of transforms of $s_0$ under the elements of $G_0$
is some divisor $\nu$ of $g$, the common order of $G$ and $G_0$. Each coset $G_0 s^i$ therefore
transforms $s_0$ into $\nu$ distinct elements. If two cosets yield a common
transform of $s_0$, by writing those cosets in the form $r_1 G_0$, $r_2 G_0$, $r_1$ and $r_2$ being
elements of the cosets yielding that common transform, we see that the set of
transforms yielded by one coset is identical with the set yielded by the other.
The transforms of $s_0$ under $G^*$ thus fall into a certain number $\kappa$ of mutually
exclusive classes of $\nu$ elements each. By a method entirely analogous to that
used in the analysis of an arbitrary containing group, we easily find that $\kappa$
is a divisor of $m-1$, and that the first $\kappa$ cosets all yield distinct sets of $\nu$
transforms each, these being repeated in order by each succeeding set of $\kappa$
cosets. We thus have the following theorem. The number of transforms of an
element under the polyads of an m-group of order g is of the form $\kappa\nu$, where $\nu$
is a divisor of g, $\kappa$ a divisor of $m-1$. For each i the i-ads of the group yield $\nu$
distinct transforms. The $\kappa\nu$ transforms can be obtained from the i-ads with
$i = 1, 2, \ldots, \kappa$; and these $\kappa$ mutually exclusive sets of $\nu$ transforms each are
cyclically repeated for i-ads with $i > \kappa$.

We can now connect the theory of transforms with that of groups of sub- 


(78) As may be shown by an example. 


stitutions. For convenience set $\mu = \kappa + 1$, and let $\Gamma_1, \Gamma_2, \ldots, \Gamma_{\mu-1}$ be the mutually
exclusive sets of $\nu$ transforms each corresponding to $i = 1, 2, \ldots, \mu-1$
respectively. If $s_j$ is any element of $G$, and $s'$ is the transform of $s_0$ by an
i-ad of $G$, $s_j^{-1} s' s_j$ will be the transform of $s_0$ by an $(i+1)$-ad of $G$. It follows
that $s_j$ transforms the members of each $\Gamma_i$ in 1-1 fashion into the members
of $\Gamma_{i+1}$. Thus each element of $G$ determines a $\mu$-adic substitution on
$\Gamma_1, \Gamma_2, \ldots, \Gamma_{\mu-1}$. Clearly, the product of m elements of $G$ yields a $\mu$-adic substitution
which is the product of the $\mu$-adic substitutions yielded by those
elements. Certainly then, for our finite $G$ the class of all $\mu$-adic substitutions
corresponding to elements of $G$ constitutes an m-adic group of $\mu$-adic substitutions
isomorphic with $G$. It is readily seen that if $N$ elements of $G$ correspond
to one $\mu$-adic substitution, exactly $N$ elements of $G$ correspond to each
$\mu$-adic substitution, and the isomorphism is $(1, N)$. Finally, this $(m, \mu)$ substitution
group is transitive. For if element $s'$ of $\Gamma_i$ is the transform of $s_0$ by
the i-ad $\{s_{j_1}, s_{j_2}, \ldots, s_{j_i}\}$ of $G$, the transforms of $s'$ by the elements $s_j$ of $G$
are the transforms of $s_0$ by the $(i+1)$-ads $\{s_{j_1}, s_{j_2}, \ldots, s_{j_i}, s_j\}$ of $G$, hence by
all $(i+1)$-ads of $G$, and so constitute the whole class $\Gamma_{i+1}$.

When $\kappa = 1$, the $(m, \mu)$ substitution group becomes a transitive m-group
of ordinary substitutions. The transforms of $s_0$ under the elements of $G$, now
identical with the transforms of $s_0$ under the polyads of $G$, then include $s_0$,
and are such that each is transformed into the entire set by the elements of $G$.
On the other hand, when $\kappa > 1$, the transforms of $s_0$ under the elements of $G$
are transformed by the elements of $G$ into an entirely different set. Nor can
they then include $s_0$; for $s_0$, being transformed into itself by that $(m-1)$-ad
of $G$ which is the identity of $G_0$, appears for the first time in the set of transforms
for which $i = \kappa$. We thus see that the transforms of $s_0$ under the elements
of $G$ must be said to constitute a complete set of conjugates of $s_0$ under $G$
when and only when $\kappa = 1$. And the fact that then and only then is $s_0$ included
in that set of transforms needs only restatement to become the following
useful criterion. The necessary and sufficient condition that the transforms of an
element $s_0$ by the elements of an m-group $G$ constitute a complete set of conjugates
under $G$ is that $s_0$ is commutative with some element of $G$. As in the case of ordinary
groups, the elements of $G$ thus leaving $s_0$ invariant constitute a subgroup
$H$ of $G$. If $G$ is expanded in right cosets as regards $H$, each coset consists of all
the elements of $G$ transforming $s_0$ into some one element. Hence, here too
the number of conjugates of $s_0$ under $G$ is the order of $G$ divided by the order
of the largest subgroup of $G$ leaving $s_0$ invariant.

If so is actually an element of G, the above condition is automatically satis- 
fied with so itself as element commutative with so. We thus have the signifi- 
cant fact that the transforms of an element of a polyadic group under the 
elements of the group always constitute a complete set of conjugates under 
the group(™). Hence, as for ordinary groups, all the elements of an m-group G 


(™) Essentially a result of Miller’s when stated for “perfect cosets.” 


can be separated into distinct complete sets of conjugates as regards G, and this 
separation can be performed in only one manner. 

In the case of an i-ad of G with $i > 1$ the transforms of the i-ad by the elements
of G need no longer constitute a complete set of conjugates. Thus in
the non-abelian 3-group of order three a dyad not the identity has but one
transform under the elements of the group, two under the polyads of the
group. However, our general theorem holds in this case; and since the i-ad is
invariant under itself, it readily follows that $\kappa$ is a divisor of H.C.F.$(i, m-1)$.
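
The example of the non-abelian 3-group of order three can be checked directly. The sketch below assumes a concrete realisation that is not given in the text: the three transpositions on three letters under the ternary product abc, with the symmetric group on three letters playing the part of the containing group. All function names are illustrative.

```python
from itertools import permutations

def mul(p, q):
    """Product pq of permutations given as tuples: apply p, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def inv(p):
    out = [0] * len(p)
    for i, x in enumerate(p):
        out[x] = i
    return tuple(out)

def transform(s, r):
    """The transform r^{-1} s r."""
    return mul(mul(inv(r), s), r)

def is_odd(p):
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 1

S3 = list(permutations(range(3)))          # assumed containing group
G = [p for p in S3 if is_odd(p)]           # the three transpositions

# The product of three odd permutations is odd, so G is closed under abc.
assert all(mul(mul(a, b), c) in G for a in G for b in G for c in G)

# An element of G: its transforms under the elements of G, and its stabilizer.
s0 = G[0]
conj_by_elements = {transform(s0, r) for r in G}
stabilizer = [r for r in G if transform(s0, r) == s0]

# A dyad not the identity: one transform under elements, two under polyads.
dyad = mul(G[0], G[1])
dyad_by_elements = {transform(dyad, r) for r in G}
dyad_by_polyads = {transform(dyad, r) for r in S3}
```

In this model the element `s0` has three conjugates, equal to the order of G divided by the order of its stabilizer, while the dyad has a single transform under the elements of G and that transform is not the dyad itself.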

25. Generation of polyadic groups by two groups, one invariant under the 
elements of the other. We shall consider two distinct cases. In the first, a 
2-group $H_0$ is invariant under each element of an m-group K, in the second,
an m-group H is invariant under each element of an m-group K. The discus- 
sion of the m-group G generated by the two given abstract groups can also 
be carried through from two different points of view, the first, that of the 
investigation of properties of groups assumed given, the second, that of the 
construction of groups hitherto unknown. 

We have already illustrated the constructional point of view in §8. Our 
present interests being largely theoretical, we shall not further pursue the 
complexities introduced by that point of view in the field of abstract group 
theory, but merely obtain the results given by the first point of view(”). 


(*) This is the point of view really followed by Miller in §25, Finite Groups, despite the 
section heading “Construction of Groups with Invariant Subgroups.” He thus obtains the 
theorem: “If all the elements of a group H transform G into itself, then H and G generate a 
group whose order is the order of G multiplied by the index under H of the crosscut of G and 
H.” The constructional point of view, while using his treatment for purposes of analysis, would 
necessitate the following complications. H and G would be given by group-satisfying multiplica- 
tion tables on specified symbols as elements. These tables must then satisfy the consistency 
condition that H and G have at least one element in common, and that the product of two 
elements common to H and G is the same in H as in G. With each element of H there would 
be given a corresponding automorphism of G which is to be the automorphism of G induced by 
transforming it by that element of H. These automorphisms must then satisfy the consistency 
conditions that the product of the automorphisms corresponding to two elements of H is the 
automorphism corresponding to the product of those elements, while the automorphism cor- 
responding to any element of H common to H and G is the automorphism of G induced by that 
element as element of G. That posited, our guess is that H and G, assumed finite, will generate 
a unique group in the sense that there exists a group K which, with respect to itself as funda- 
mental group, is the group generated by H and G, while all such groups are simply isomorphic; 
a simple isomorphism being in fact determined by letting each element of H and G correspond
to itself. 

The above criticism assumes that we are dealing with abstract groups, the title of the 
chapter in which the above section appears. If the generating groups be given as substitution 
groups, for example, the divergence between the two points of view disappears, as there is al- 
ways the symmetric group on all the letters involved to act as fundamental group. A similar 
situation obtains for m-adic groups of ordinary substitutions as is shown in the last footnote of 
our present section. 

It should be pointed out that what we have termed the constructional point of view is 
followed in the related theory of group extensions. (See the first footnote to §8.) That it is but 


This point of view assumes a given m-group $F$. In the first of our two cases,
$H_0$ is a subgroup of the associated ordinary group $F_0$ of $F$ which is invariant
under each element of a subgroup $K$ of $F$. It is convenient here to consider $F$
a subgroup of itself. The crosscut of all subgroups of $F$ which are such that $H_0$
is a subgroup of their associated groups, $K$ of themselves, is itself one of these
subgroups, and will be said to be the m-group $G$ generated by $H_0$ and $K$. We
may then also say that the m-group $G$ generated by $H_0$ and $K$ is the smallest
subgroup $G$ of $F$ such that $H_0$ is a subgroup of $G_0$, $K$ of $G$. Similarly, if $H$ and
$K$ are two subgroups of $F$ with $H$ invariant under each element of $K$, the
m-group $G$ generated by $H$ and $K$ is the smallest subgroup $G$ of $F$ such that
$H$ and $K$ are subgroups of $G$. The existence and uniqueness of $G$ is thus assured,
but is entirely relative to the given m-group $F$.

We shall first consider the subcase of the general $H_0$, $K$ case where $K$ is
the cyclic m-group generated by an element $s$ of $F$. The m-group generated
by $H_0$ and $K$ may then also be said to be generated by $H_0$ and $s$. The invariance
condition now reduces to $H_0$ being transformed into itself by $s$. Consider
the cosets $H_0 s, H_0 s^{[1]}, H_0 s^{[2]}, \ldots$. If $\gamma$ is the m-adic order of $s$, $H_0 s = H_0 s^{[\gamma]}$.
Let then $\kappa$ be the smallest positive integer for which $H_0 s = H_0 s^{[\kappa]}$. It then
easily follows that the cosets $H_0 s, H_0 s^{[1]}, \ldots, H_0 s^{[\kappa-1]}$ are mutually exclusive,
while succeeding cosets are cyclic reproductions of these. Hence, also,
$\kappa$ is a divisor of $\gamma$.

We now readily show that the m-group $G$ generated by $H_0$ and $s$ is given
by

\[ G = H_0 s + H_0 s^{[1]} + \cdots + H_0 s^{[\kappa-1]}. \]

Since $H_0$ is a subgroup of $F_0$, $s$ an element of $F$, the set $G$ thus defined is contained
in $F$. Furthermore, the invariance of $H_0$ under $s$ coupled with the above
coset analysis shows that the product of m elements of $G$ is in $G$. Hence $G$ is,
indeed, a subgroup of $F$. $G$ has $s$ for element, in fact, as a member of $H_0 s$.
Hence $G_0 = G s^{-1}$ has $H_0$ as subgroup. Finally, as with $F$, every subgroup of $F$
whose associated group has $H_0$ as subgroup, while it has $s$ as element, contains
$G$. Hence $G$ is the m-group generated by $H_0$ and $s$. We thus have the
theorem: If $s$ is an element of an m-group, $H_0$ a subgroup of the associated group
of that m-group invariant under $s$, then if $s^{[\kappa]}$ is the smallest positive m-adic
power of $s$ which is in the coset $H_0 s$, $H_0$ and $s$ generate an m-group whose order is $\kappa$
times the order of $H_0$.
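
The theorem can be illustrated in an abelian model, where the invariance condition holds automatically. The sketch below assumes, for illustration only, that the fundamental m-group is the integers mod g under c(x_1, ..., x_m) = x_1 + ... + x_m, so that the m-adic nth power of s is ((m-1)n + 1)s. The function name `generated_by` and the sample values are hypothetical.

```python
def generated_by(H0, s, m, g):
    """Return (kappa, G) for the cosets H0 + s^[n] in Z_g, where
    s^[n] = ((m-1)n + 1) s under the assumed operation
    c(x_1..x_m) = x_1 + ... + x_m (mod g)."""
    base = frozenset((h + s) % g for h in H0)
    cosets, n = [], 0
    while True:
        power = (((m - 1) * n + 1) * s) % g      # s^[n], written additively
        coset = frozenset((h + power) % g for h in H0)
        if n > 0 and coset == base:
            return n, set().union(*cosets)       # n is the least such, i.e. kappa
        cosets.append(coset)
        n += 1

# Sample case: g = 12, m = 3, H0 = {0, 6}, s = 1.
kappa, G = generated_by({0, 6}, 1, 3, 12)
closed = all((a + b + c) % 12 in G for a in G for b in G for c in G)
```

In the sample case the cosets are {1, 7}, {3, 9}, {5, 11}, so kappa is 3 and the order of G is 3 times the order of H0; the set G is the odd residues, which is closed under the ternary sum.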

In the general $H_0$, $K$ case let $L_0$ be the crosscut of $H_0$ and $K_0$. Since $H_0$
and $K_0$ are invariant under each element of $K$, the same is true of $L_0$. Expand
$K$ in cosets as regards $L_0$, and let $s_1, s_2, \ldots, s_{\kappa}$ be a corresponding set of multi-


a related theory may be seen from the definition of an extension K of G by H as a group having
G as invariant subgroup, with the quotient group K/G simply isomorphic to H. The above
complication arising from the common elements of H and G does not then arise.


pliers. We then show that the m-group $G$ generated by $H_0$ and $K$ has the expansion

\[ G = H_0 s_1 + H_0 s_2 + \cdots + H_0 s_{\kappa} \]

with all indicated elements distinct. As a consequence of the invariance of $H_0$
under each $s_i$ we can reduce the product of m elements of the set $G$ thus defined
to the form $ts$, with $t$ in $H_0$, $s$ in $K$. As $s$ can further be written $t's_i$ with $t'$
in $L_0$, and hence in $H_0$, $s_i$ one of the above $\kappa$ multipliers, we see that the product
in question is in $G$. It then follows as in the special case that $G$ is the
m-group generated by $H_0$ and $K$. Moreover, suppose that with $t_1$ and $t_2$ in $H_0$
we have $t_1 s_{i_1} = t_2 s_{i_2}$. Then $t_2^{-1} t_1 = s_{i_2} s_{i_1}^{-1}$. Since the left side of this equation
represents an element of $H_0$, the right of $K_0$, this one element $\tau$ is in $L_0$. But
then $s_{i_2} = \tau s_{i_1}$, contradicting the assumption that $s_1, s_2, \ldots, s_{\kappa}$ were the set
of multipliers in question. The indicated elements of $G$ are thus distinct, and
we have the theorem: If $K$ is a subgroup of an m-group, $H_0$ a subgroup of the
associated group of the m-group invariant under each element of $K$, then $H_0$ and
$K$ generate an m-group whose order is the order of $H_0$ multiplied by the index
under $K$ of the crosscut of $H_0$ and the associated ordinary group $K_0$ of $K$.

We turn now to the more interesting case of the m-group $G$ generated by
two m-groups $H$ and $K$, with $H$ invariant under each element of $K$. Note
that $H_0$, the associated ordinary group of $H$, is then also invariant under $K$.
It is readily seen that while the m-group generated by $H_0$ and $K$ is contained
in the m-group generated by $H$ and $K$, it will be identical with that m-group
when and only when it contains an element of $H$. This means that for some $t$
in $H_0$, $s$ in $K$, $s'$ in $H$, $ts = s'$. But this is equivalent to $s = t^{-1}s'$, i.e., to $s$ also
being in $H$. Hence the m-group generated by $H$ and $K$ is identical with the m-group
generated by $H_0$ and $K$ when and only when $H$ and $K$ have a common element.

In particular, if $H$ and $K$ have but one common element, while each element
of $H$ is commutative with each element of $K$, we shall say that the
m-group $G$ generated by $H$ and $K$ is their direct product. $G$, then, is also the
m-group generated by $H_0$ and $K$, and by $K_0$ and $H$, and correspondingly has
expansions which may be briefly written $G = H_0 \times K = K_0 \times H$. For $L_0$ now reduces
to the identity, so that in the first case, for example, the multipliers $s_i$
are all the elements of $K$. More symmetrically then, $G = (H_0 \times K_0)s$, with $s$,
say, the unique common element of $H$ and $K$. It follows that $G_0 = H_0 \times K_0$
is the ordinary direct product of $H_0$ and $K_0$. Clearly $s$ must be of first order,
since all of its m-adic powers must be common elements of $H$ and $K$; and being
invariant under $H$ and $K$, it must also be invariant under $G$. All three
groups are therefore reducible to ordinary groups, and simultaneously so.
These considerations immediately extend to the “direct product” of any number
of m-groups provided the unique common element is also the only element
common to each group and the group generated by the remaining groups.




Special as this concept of direct product thus turns out to be, it is very useful 
in the theory of abstract polyadic groups. By contrast, the direct product 
method as applied to m-adic substitution groups, while involving no restric- 
tion on the groups per se, did not yield the m-group generated by the given 
m-groups, and hence is restricted in its usefulness to the construction of de- 
sired m-groups. 

In the most general H, K case consider the abstract containing groups of 
the m-groups involved. Since H, K, and G are subgroups of the fundamental 
m-group F, their abstract containing groups H*, K*, and G* may be consid- 
ered subgroups of the abstract containing group F* of F. As the elements of 
an m-group generate the corresponding containing group, it easily follows 
that G* is the ordinary group generated by H* and K*. Since H* will be in- 
variant under each element of K*, the standard theorem tells us that the 
order of G* is equal to the order of H* multiplied by the index under K* of the 
crosscut of H* and K*. But the order of the abstract containing group of an 
m-group is m—1 times the order of the m-group. Hence the order of G is 
equal to the order of H multiplied by that index. 

It is easy indeed to write the actual expansion of G. According to the standard theory,
if we expand K* in cosets as regards the crosscut of H* and K*, and let r_1, r_2, ···,
r_n be the corresponding set of multipliers, then G* = H*r_1 + H*r_2 + ··· + H*r_n with
all indicated elements distinct. If then r_j is an i_j-ad of K, the elements of H*r_j in
G will be the (m−i_j)-ads of H multiplied by r_j. We may therefore write, in notation
thus suggested,

G = (H)_{m−i_1} r_1 + (H)_{m−i_2} r_2 + ··· + (H)_{m−i_n} r_n.


Returning to the order of G, we seek a useful expression for the index in question. The
crosscut L of H* and K* will consist of the common i-ads of H and K for i = 1, 2, ···,
m−1. For i = m−1, these common i-ads constitute the crosscut L_0 of H_0 and K_0. Let l be
the order of L_0, κ the smallest value of i for which H and K have a common i-ad. Then,
by methods already made familiar, we find that κ is a divisor of m−1, while L consists of
l i-ads for each of the (m−1)/κ i's κ, 2κ, ···, m−1. The order of L is thus l(m−1)/κ. If
now k is the order of K_0, the order of K* is (m−1)k. Hence the index under K* of L is
κk/l. But k/l is the index under K_0 of L_0. Hence the index of L under K* is κ times the
index of L_0 under K_0. We therefore have the following theorem. If H and K are two
subgroups of an m-group such that all the elements of K transform H into itself, then H
and K generate an m-group whose order is the order of H multiplied by the index under K_0
of the crosscut of H_0 and K_0 multiplied by a divisor κ of m−1, where κ is the smallest
value of i for which H and K have a common i-ad.

Of special interest is the case where K is the cyclic m-group generated by an element s
which transforms H into itself. K* is then an ordinary cyclic group also generated by s.
Hence if s^λ is the smallest positive ordinary power of s in H*, λ will be the index
under K* of the crosscut of H* and K*. We thus have as our first result: If an element s
of an m-group transforms a subgroup H of that m-group into itself, and if s^λ is the
smallest positive ordinary power of s in the containing group H* of H, then s and H
generate an m-group G whose order is λ times the order of H. Indeed, the expansion of G
is now readily seen to be

G = (H)_{m−1} s + (H)_{m−2} s^2 + ··· + (H)_{m−λ} s^λ.

Note that if k is the m-adic order of s, and hence k(m−1) its ordinary order, λ is a
divisor of k(m−1). Since the order of G must exceed the order of H whenever s is not in
H, we obtain the following useful corollary further generalized below. If s is of m-adic
order one, and not in H, then the order of G is equal to the order of H multiplied by a
divisor, not unity, of m−1.

More refined results are yielded by our earlier analysis of the above mentioned index.
The associated group K_0 of the cyclic m-group K generated by s will be the cyclic
ordinary group generated by s^{m−1}. Hence, if s^{ν(m−1)} is the smallest positive power
of s^{m−1} in H_0, ν will be the index under K_0 of the crosscut of H_0 and K_0.
Consequently, the order of G is also equal to the order of H times ν times κ, where
s^{ν(m−1)} is the smallest positive power of s^{m−1} in H_0, κ the smallest value of i
for which H and the cyclic m-group generated by s have a common i-ad.

We may note certain relationships between the constants thus involved. The connecting
link between our two expressions for the order of G is the equation λ = νκ. λ is thus
determined by ν and κ. Conversely ν and κ are determined by λ and m. For the common
elements of H* and K* are s^λ, s^{2λ}, ···. It therefore easily follows that
κ = H.C.F.(m−1, λ), and hence ν = λ/H.C.F.(m−1, λ). By means of m-adic groups of ordinary
substitutions it is readily shown that λ and m may assume arbitrary values. In the case
of κ, ν, and m, we have already observed that κ is a divisor of m−1. Our expressions for
κ and ν in terms of λ and m further show that (m−1)/κ is prime to ν. Now it is readily
verified that if κ, ν, and m are arbitrarily chosen subject to these two conditions, then
λ = νκ redetermines the same κ and ν by means of the above formulas. It follows that κ,
ν, and m may assume any values subject to these conditions. If we now further introduce
the m-adic orders h and k of H and s, we obtain the further conditions ν a divisor of k,
hν a multiple of k; the first from the index interpretation of ν, the second from the
order requirement imposed by s^{ν(m−1)} being in H_0. We have not carried the
investigation far enough, however, to see whether the resulting four necessary conditions
on h, k, κ, ν, and m suffice to insure a corresponding H and s(76).
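
The relations between λ, κ, ν, and m stated above are purely arithmetical and can be
checked mechanically. The following is a minimal sketch, not part of the original paper;
the function names split_lambda and admissible are hypothetical, and H.C.F. is computed
with the standard library's gcd.

```python
from math import gcd

def split_lambda(lam: int, m: int) -> tuple[int, int]:
    """Return (kappa, nu) with kappa = H.C.F.(m-1, lam) and nu = lam / kappa."""
    kappa = gcd(m - 1, lam)
    return kappa, lam // kappa

def admissible(kappa: int, nu: int, m: int) -> bool:
    """The two conditions of the text: kappa divides m-1, and (m-1)/kappa is prime to nu."""
    return (m - 1) % kappa == 0 and gcd((m - 1) // kappa, nu) == 1

for m in range(2, 12):
    # every pair (lam, m) yields an admissible (kappa, nu) with lam = nu * kappa
    for lam in range(1, 40):
        kappa, nu = split_lambda(lam, m)
        assert kappa * nu == lam and admissible(kappa, nu, m)
    # every admissible (kappa, nu, m) is redetermined by lam = nu * kappa
    for kappa in range(1, 12):
        for nu in range(1, 12):
            if admissible(kappa, nu, m):
                assert split_lambda(nu * kappa, m) == (kappa, nu)
```

For instance, with m = 5 and λ = 6 the sketch gives κ = 2 and ν = 3, and (m−1)/κ = 2 is
prime to ν = 3.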


(76) When h = k, the fourth condition is automatically satisfied. In this case the writer
has verified by an example that h = k, κ, ν, and m may have arbitrary values subject to
the first three conditions.

In constructing such examples by means of m-adic groups of ordinary substitutions, we are
naturally led to the ordinary groups they generate as containing groups. On the other
hand, our theory concerns their abstract containing groups only. In the λ, m example
referred to above, it was possible to avoid this difficulty by so choosing H, K, and the
fundamental F that their concrete containing groups were all of index m−1, and so simply
isomorphic with their abstract containing groups. On the other hand, especially in the
case of F, it is desirable to dispense with this requirement. For we could then fully
make use of the fact that as for ordinary substitution groups, so for m-adic substitution
groups, a fundamental F is always at hand, namely, the extension to an m-group of the
ordinary symmetric group on all the letters involved; and clearly all fundamental F's
which are m-adic groups of ordinary substitutions yield the same G.

Actually, we can easily obtain the desired information concerning the abstract containing
groups, and so the order of G, from any containing groups. We shall consider our general
H, K case. Let F be a corresponding fundamental m-group, F*' any containing group of F.
The subgroups of F*' generated by the elements of H and K respectively will then be
containing groups H*' and K*' of H and K. In the above case of m-adic groups of ordinary
substitutions, F*' may be the ordinary substitution group generated by the substitutions
of F, in which case H*' and K*' will be the ordinary substitution groups generated by the
substitutions of H and K, and so obtainable without the explicit use of F*'. Let H*',
K*', F*' be of indices ρ_1, ρ_2, ρ. All three indices will then be divisors of m−1.
Furthermore, it is readily seen that ρ_1 and ρ_2 will be multiples of ρ. Now if the
cosets into which these containing groups are broken up are cyclically repeated until
there are m−1 of each, the ith cosets of H*' and K*' will be contained in the ith coset
of F*' for i = 1, 2, ···, m−1. In particular, the (m−1)-st cosets will be the associated
ordinary groups H_0', K_0', F_0'. And in the simple isomorphism between F_0' and F_0, the
abstract associated ordinary group of F, the subgroups H_0' and K_0' will correspond to
H_0 and K_0. Hence, the index under K_0 of the crosscut of H_0 and K_0 is also the index
under K_0' of the crosscut of H_0' and K_0', where the latter may now be considered the
ρ_1th and ρ_2th cosets in H*' and K*'. As for κ, note that two products of i elements
each taken from an m-group will be identical in a containing group of an m-group when and
only when those two i-ads of elements are equivalent. Hence, the smallest value of i for
which H* and K* have a common i-ad is also the smallest value of i for which H*' and K*',
repeated as above, have a common i-ad. Actually, the pairs of ith cosets of H*' and K*'
start repeating after i = L.C.M.(ρ_1, ρ_2). Hence, κ may be found from H*' and K*' if
their cosets be cyclically repeated to a total number equal to L.C.M.(ρ_1, ρ_2) each.
Clearly κ is a divisor of L.C.M.(ρ_1, ρ_2). If desired, it is not difficult to give a
number theoretic expression for κ in terms of the distribution of (i, j)'s for which an
i-ad of H*' and a j-ad of K*' in their unrepeated form are identical.

The order of G is thus determinable from H*' and K*'. Explicitly F*' does not enter.
Hence, in the case of H and K m-adic groups of ordinary substitutions, no further
reference need be made to F. In particular, then, if H*' and K*' are each of index m−1,
the order of G is found exactly as if they were the abstract containing groups of H and
K.

H is clearly an invariant subgroup of the generated group G. If then σ is the element of
the m-adic quotient group G/H corresponding to s, ν is seen to be the m-adic order of σ.
For the least positive ν with s^{ν(m−1)} in H_0 is the least positive ν with
s^{ν(m−1)+1} in H_0 s, and hence the least positive ν with σ^{ν(m−1)+1} = σ. By a simple
result of our later §29 the m-adic order of σ is a divisor of the m-adic order of s. Our
previous corollary thus generalizes to the following. The order of G is equal to the
order of H multiplied by a multiple of a divisor other than unity of the m-adic order of
s whenever the m-adic order of the element of G/H corresponding to s is not unity; when
the latter order is unity, and yet s is not in H, then the order of G is equal to the
order of H multiplied by a divisor, not unity, of m−1.

26. m-adic groups of order g prime to m−1. Let G be any m-group whose order g is prime to
m−1. The order of any element s of G, being a divisor of g, will then also be prime to
m−1. The cyclic m-group generated by s therefore has one and only one first order element
s_0, i.e., s generates one and only one first order element s_0. G therefore has at least
one first order element; and if it has exactly λ first order elements, all of its
elements can be separated into λ corresponding mutually exclusive classes of elements,
each class consisting of all the elements of G which separately generate the
corresponding first order element. Now no first order element of G can transform another
first order element of G into itself. For otherwise, by the first of the two corollaries
of the last section, the two would generate a subgroup of G whose order would be a
divisor, not unity, of m−1. But, as in the case of an element of G, the order of any
subgroup of G must be prime to m−1. It follows that if element s of G generates the first
order element s_0, and hence transforms s_0 into itself, it can transform no other first
order element of G into itself; for otherwise s_0, a power of s, would transform that
other first order element into itself. The class of elements of G each generating s_0
therefore consists of all the elements of G which transform s_0 into itself, and hence
constitute a subgroup of G. As this subgroup has s_0 for invariant first order element,
it is reducible to an ordinary group. We have thus proved that if G is an m-group whose
order is prime to m−1, the elements of G can be separated into a number λ of mutually
exclusive subgroups of G, all reducible to ordinary groups, where λ is the number of
first order elements of G, and each subgroup contains one and only one first order
element of G, and, indeed, consists of all the elements of G that transform that first
order element into itself.

Other immediate consequences of the above proof are the following. G is 
reducible to a 2-group when and only when it has but a single first order element. 
If G has more than one first order element, it has no invariant element, and hence 
is not derivable from a 2-group. In particular, every abelian m-group whose order 
is prime to m—1 has one and only one first order element, and hence is reducible 
to a 2-group(77).

We may note in passing the marked simplicity, from the standpoint of 
polyadic theory, of those m-groups of order prime to m —1 which are reducible 
to 2-groups. As seen below, the one first order element of such an m-group
is also the one and only first order element of each of its subgroups. These 
subgroups are therefore also reducible to 2-groups. Furthermore, both the 
group and its subgroups are reducible to 2-groups in one and only one way. 
It easily follows by a slight modification of our cyclic m-group argument that 
when the above m-group is reduced to the 2-group, its subgroups are reduced 
to the subgroups of that 2-group. 


(77) This generalizes a theorem of Lehmer on abelian 3-groups. 




Returning to our arbitrary m-group G of order g prime to m−1, we proceed to show that the
first order elements of G, as well as the corresponding λ subgroups into which G was
decomposed, constitute a complete set of conjugates under G. It will then follow that
these λ subgroups are all of the same order, and hence that the number of first order
elements of G is a divisor of the order of G(78). Let s_0^{(1)}, s_0^{(2)}, ···,
s_0^{(λ)} be the first order elements of G, k_1, k_2, ···, k_λ the orders of the
corresponding λ subgroups of G. Since exactly k_i elements of G transform s_0^{(i)} into
itself, s_0^{(i)} is transformed into g/k_i different elements by all the elements of G.
As the transform of a first order element is also of the first order, g/k_i ≤ λ, i.e.,
k_i ≥ g/λ. Since g = k_1 + k_2 + ··· + k_λ, it follows that the equality sign must hold
for each i. Each s_0^{(i)} therefore has the λ first order elements of G for its
different transforms under the elements of G, whence the first half of our theorem. Now
if s_1 generates the first order element s_0^{(i)}, say s_0^{(i)} = s_1^μ, then
(s^{-1} s_1 s)^μ = s^{-1} s_0^{(i)} s; that is, the transform of s_1 under s generates
that first order element which is the transform of s_0^{(i)} under s. Hence, if element s
of G transforms s_0^{(i)} into s_0^{(j)}, it transforms the subgroup corresponding to
s_0^{(i)} into the subgroup corresponding to s_0^{(j)}, whence the rest of our result.
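
The counterexample of footnote 78, the ordinary symmetric group of degree three extended
to a 3-group, can be checked by direct enumeration. The sketch below is an illustration
only and makes one assumption explicit: an element s is taken to be of first order
exactly when the product of m copies of s equals s. The helper names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Product of two permutations given as tuples: apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def m_fold_power(s, m):
    """Product of m copies of s under ordinary composition."""
    out = s
    for _ in range(m - 1):
        out = compose(out, s)
    return out

def first_order_elements(elements, m):
    """Elements s whose m-fold product s s ... s equals s."""
    return [s for s in elements if m_fold_power(s, m) == s]

s3 = list(permutations(range(3)))          # the six substitutions on three letters
first = first_order_elements(s3, 3)        # identity and the three transpositions
print(len(s3), len(first))                 # prints: 6 4
```

Here the order six is not prime to m−1 = 2, and the count four is not a divisor of six,
so the hypothesis of the theorem cannot be dropped.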

It follows from the above that the λ first order elements of G also constitute a complete
set of conjugates under the m-group they generate. For that m-group will have an order
prime to m−1, while its first order elements will be the λ first order elements of G.
Since the m-group generated by a given set of elements chosen from a finite m-group will
actually consist of all extended products of elements chosen from the set, it follows
that the λ first order elements of G constitute a “generalized” complete set of
conjugates under themselves, that is, each can be obtained from any other by a succession
of transforms by first order elements only. Actually, this statement is weaker than the
one immediately preceding, since it amounts to saying that the λ first order elements of
G constitute a complete set of conjugates under any containing group of the m-group they
generate. In any case, the question whether they constitute a complete set of conjugates
under themselves, in the sense that any one can be transformed into any other by a third,
is left open(79).

(78) For, if k is the common order of these λ subgroups, g = kλ. That the number of first
order elements of an arbitrary m-group need not be a divisor of its order is illustrated
by the ordinary symmetric group of degree three extended to a 3-group. This 3-group of
order six has four first order elements.

(79) Note that the statement of Miller, page 30 of Finite Groups, to the effect that the
Sylow subgroups of order p^β of a group constitute a complete set of conjugates under
themselves must also be interpreted in the above sense of a generalized complete set of
conjugates. At least, that is all the proof there given allows us to infer.

We have already observed that λ is a divisor of g. While it is therefore prime to m−1, we
now find an additional restriction imposed upon it by m−1. The first order element
s_0^{(i)} is of course invariant under itself. On the other hand, since any other first
order element s_0^{(j)} is not invariant under s_0^{(i)}, it will be transformed by the
polyads {s_0^{(i)}}, {s_0^{(i)}, s_0^{(i)}}, {s_0^{(i)}, s_0^{(i)}, s_0^{(i)}}, ··· into
a number, not unity, of first order elements which either directly, or by our general
theorem on transforms, is seen to be a divisor of m−1. Since the sets of transforms of
different s_0^{(j)}'s by the above polyads are mutually exclusive when not identical, a
separation of the λ first order elements into mutually exclusive classes is thus
effected, one class consisting of but one element, every other class of a number of
elements which is a divisor, not unity, of m−1. Hence, if p_1, p_2, ···, p_ν are the
distinct prime divisors of m−1, λ is of the form
λ = 1 + k_1 p_1 + k_2 p_2 + ··· + k_ν p_ν, k_i ≥ 0. In particular, if m−1 is a power of a
single prime p, the number of first order elements of G is of the form λ = 1 + kp. While
for ν > 1 the expression for λ gives information concerning small λ's only, every
sufficiently large number being so representable, when m−1 is a power of a single prime p
the condition includes the condition λ prime to m−1, and for p > 2, is stronger than that
condition.
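
To see how restrictive the form λ = 1 + k_1 p_1 + ··· + k_ν p_ν is for a given m, one can
list the values of λ it excludes. The following is a minimal sketch under the assumption
that only this representability condition is tested (the separate requirement that λ be
prime to m−1 is not imposed); the function names are hypothetical.

```python
def prime_divisors(n: int) -> list[int]:
    """Distinct prime divisors of n >= 1, found by trial division."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes

def representable(lam: int, m: int) -> bool:
    """True when lam = 1 + sum of k_i * p_i, k_i >= 0, p_i the prime divisors of m-1."""
    primes = prime_divisors(m - 1)
    target = lam - 1
    reachable = [False] * (target + 1)
    reachable[0] = True                      # all k_i = 0 gives lam = 1
    for t in range(1, target + 1):
        reachable[t] = any(t >= p and reachable[t - p] for p in primes)
    return reachable[target]

# m - 1 = 15 has the two prime divisors 3 and 5
excluded = [lam for lam in range(1, 21) if not representable(lam, 16)]
print(excluded)                              # prints: [2, 3, 5, 8]
```

With m−1 = 15 only finitely many values are excluded, in agreement with the remark that
every sufficiently large number is representable when ν > 1; with m−1 a power of 2 the
condition reduces to λ odd.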

A peculiar property of the sets of transforms arising in the preceding proof is that each
set, clearly invariant under s_0^{(i)}, in turn generates s_0^{(i)}. More generally, any
set of first order elements of G which is transformed into itself by a first order
element s_0 of G in turn generates s_0. This result is itself an immediate consequence of
the following. A first order element of G which transforms a subgroup of G into itself
must be contained in that subgroup. The proof of the last result consists in noting that,
otherwise, that first order element and the subgroup would generate a subgroup of G whose
order was the order of the given subgroup multiplied by a divisor, not unity, of m−1. As
for the result preceding, the subgroup of G generated by the given set, being
consequently invariant under s_0, must contain s_0.

Since the order of a subgroup of G must also be prime to m−1, there will be associated
with every subgroup of G an existent subset of the λ first order elements of G, namely,
the set of first order elements of the subgroup. These “group-bearing” subsets of the λ
first order elements of G can be independently characterized as those existent subsets of
the λ first order elements which generate no other first order elements. By the reasoning
of the preceding paragraph, a first order element which transforms a group-bearing subset
of first order elements into itself must be contained in that subset. As the converse
must also be true, it follows that a first order element, and hence indeed any element,
of G either leaves both a subgroup of G and the set of first order elements of that
subgroup invariant, or else transforms neither into itself. Clearly, two subgroups of G
have a common element when and only when their sets of first order elements have a common
element. We finally note the following. If s_0^{(1)}, s_0^{(2)}, ···, s_0^{(μ)} are the
first order elements of some subgroup of G, then of all subgroups of G with exactly those
first order elements there is one contained in, and one containing each. The smallest
subgroup is of course the crosscut of all the subgroups in question, and will indeed be
the subgroup H generated by those first order elements(80). Now let K be the subgroup of
G consisting of all the elements of G which transform H into itself. K will then contain
all of the above subgroups. And since each of the first order elements of K transforms H
into itself, they will all be in H, and hence will be the given first order elements. K
is therefore that largest subgroup of our theorem.

The above theory is significant only if there exist m-groups of order prime to m−1 with
more than one first order element, and, preferably, not consisting wholly of first order
elements. For odd m−1 ≠ 1 such an m-group is furnished by the complete m-adic 6-group
which is of order 2^{m−1} and has 2^{m−2} first order elements. The 2^{m−2} second order
subgroups are then the corresponding mutually exclusive subgroups into which the elements
of the group are separated. For m−1 even, and λ prime to m−1, the λ second order elements
of the ordinary dihedral group of order 2λ constitute such an m-group under the product
of m elements as operation. In this m-group all λ elements are of m-adic order one.
However, by the direct product method, we can obtain from this m-group, and a cyclic
m-group of order g/λ, an m-group of arbitrary order g prime to the even m−1, and with an
arbitrary divisor λ of g as the number of its first order elements. Most of the theory
can be illustrated by means of these examples.
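
The dihedral example can be tested on small cases. The sketch below is an illustration
under stated assumptions: the λ second order elements of the dihedral group of order 2λ
are modelled as the maps x -> a − x (mod λ), each labelled by its parameter a, so that
the product of an odd number of them is again such a map whose parameter is an
alternating sum. The names reflection_product and check_reflection_m_group are
hypothetical.

```python
from itertools import product
from math import gcd

def reflection_product(params, lam):
    """Parameter of the composite of an odd number of maps x -> a - x (mod lam)."""
    total = 0
    for i, a in enumerate(params):
        total += a if i % 2 == 0 else -a     # a_1 - a_2 + a_3 - ... + a_m
    return total % lam

def check_reflection_m_group(lam, m):
    """Return (all_first_order, uniquely_solvable, one_conjugate_class)."""
    assert m % 2 == 1 and gcd(lam, m - 1) == 1   # m-1 even, lam prime to m-1
    elems = range(lam)
    # the product of m copies of an element is that element
    all_first_order = all(reflection_product([a] * m, lam) == a for a in elems)
    # with m-1 entries fixed, the remaining entry determines the product one-to-one
    uniquely_solvable = all(
        len({reflection_product(rest[:pos] + (x,) + rest[pos:], lam) for x in elems}) == lam
        for pos in range(m)
        for rest in product(elems, repeat=m - 1)
    )
    # transforms of the element 0 by each element a (each element is its own inverse)
    one_conjugate_class = {reflection_product((a, 0, a), lam) for a in elems} == set(elems)
    return all_first_order, uniquely_solvable, one_conjugate_class

print(check_reflection_m_group(5, 3))        # prints: (True, True, True)
```

In the case λ = 5, m = 3 the five elements are all of first order and form a single set
of conjugates, and λ = 5 = 1 + 2·2 has the form 1 + kp with p = 2.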

(80) In this connection a theorem of Dörnte's is of interest. To wit, if an m-group is
semi-abelian, and has at least one first order element, then its first order elements
themselves constitute a subgroup of the m-group.

27. Sylow subgroups of order p^α with g/p^α prime to m−1. That Sylow's theorem is not
universally valid for polyadic groups is shown by cyclic polyadic groups. We recall that
a cyclic m-group of order g has a subgroup of order γ, γ a divisor of g, when and only
when g/γ is prime to m−1. Hence, if p is a prime divisor of g, and p^α is the largest
power of p which divides g, a cyclic m-group of order g will have a “Sylow subgroup” of
order p^α when and only when g/p^α is prime to m−1. This example shows that our extension
of Sylow's theorem to polyadic groups as given below is the most general that can be
given in terms of a condition involving only the order and dimension of the group(81).
Note also that our cyclic group will have a Sylow subgroup for each of two distinct prime
divisors of g when and only when g itself is prime to m−1, in which case it will have a
Sylow subgroup for every distinct prime divisor of g. The same situation holds for the
applicability of our extension of Sylow's theorem to polyadic groups.

(81) Other theorems however are possible. Thus, if G is an m-group of order g whose
associated ordinary group G_0 has but one Sylow subgroup corresponding to a prime divisor
p of g, in particular if G is semi-abelian, then the necessary and sufficient condition
that G have a Sylow subgroup corresponding to p is that G have at least one element whose
order is a power, possibly the zeroth, of p. Necessary, immediately; and sufficient. For
if H_0 is that sole Sylow subgroup of G_0 of order a power of p, s the element of G, then
s can transform H_0 only into itself, while s^{m−1}, being of ordinary order a power of
p, must be in H_0. Hence H = H_0 s is an m-group, and thus a subgroup of G of the
requisite order. However, the Sylow subgroups of G corresponding to the prime p need not
then constitute a complete set of conjugates under G. Thus, if G' is an ordinary abelian
group, some extension of it G, also abelian, will consist wholly of first order elements.
There will then be g/p^α Sylow subgroups of G of order p^α, yet each is invariant under
G.

Again, in attempting to generalize the standard substitution group proof of the existence
of Sylow subgroups by means of m-adic substitution groups, the writer succeeded in
constructing a Sylow subgroup corresponding to the prime p for any symmetric m-adic
substitution group of degree a power of p. It may be of interest to note that the rest of
that standard proof goes over except for the last step. This one point of failure, and
failure there must be for an arbitrary m-group, lay in our being able to establish that
the number of elements in a double coset H_1 s H_2 was the order of a subgroup of H_1
only for the case when H_2 and the transform of H_1 under s have a common element.

We proceed then to prove the following. If the order g of an m-group G is divisible by
p^α but not by p^{α+1}, p a prime divisor of g, then if g/p^α is prime to m−1, G will
have at least one subgroup of order p^α. Our proof consists in expressing G in accordance
with our basic coset theorem, and applying the Sylow theorem for ordinary groups to the
associated ordinary group G_0 of G. By that coset theorem, and in the notation of the
abstract containing group G* of G, we may write G = s'G_0, where s' is any element of G.
Since G_0 is also of order g, it will have at least one Sylow subgroup H_0 of order p^α.
As G_0 is invariant under s', H_0 will be transformed by s' into a Sylow subgroup H_0' of
G_0 of order p^α. But the Sylow subgroups of G_0 of order p^α constitute a complete set
of conjugates under G_0. Hence some element t of G_0 will transform H_0' into H_0. It
follows that the element s'' = s't of G transforms H_0 into itself.

Now s'' as element of G will be of some m-adic order γ which is a divisor of g. If then
p^β is the largest power of p which divides γ, γ/p^β will be prime to m−1. It follows
from our theory of cyclic groups that s'' will generate an element s, also in G, of
m-adic order p^β. That is, s as element of G* will be of ordinary order p^β(m−1), and
hence s^{m−1} of ordinary order p^β. But H_0, being invariant under s'', must also be
invariant under s, and hence under s^{m−1}. Since s^{m−1} of order p^β is in G_0, and
transforms Sylow subgroup H_0 of G_0 of order p^α into itself, s^{m−1} must be in H_0. It
follows from the converse of the coset theorem that H = H_0 s is an m-group, hence a
subgroup of G, and of order p^α.

Our proof actually shows then that for each Sylow subgroup of order p^α of G_0 there is
at least one “Sylow subgroup” of order p^α of G whose associated ordinary group is that
Sylow subgroup of G_0. Conversely, the associated ordinary group of any subgroup of order
p^α of G will be a subgroup of order p^α of G_0, and hence a Sylow subgroup of order p^α
of G_0. Since one and only one subgroup of G_0 can be the associated ordinary group of a
given subgroup of G, we thus see that there is a one-many correspondence thus set up
between the Sylow subgroups of order p^α of G_0, and those of G.

Of the three results which together constitute Sylow's theorem for ordinary groups we
have therefore proved that the first, pertaining to the existence of Sylow subgroups,
goes over for polyadic groups under the given order condition. We now show that under the
same condition the third result also goes over. That is, under the condition of the
preceding theorem the Sylow subgroups of order p^α of the m-group G constitute a complete
set of conjugates under G. We have to show then that each subgroup of order p^α of G can
be transformed into any other by an element of G. Let H' and H be any two such Sylow
subgroups of G, H_0' and H_0 the corresponding Sylow subgroups of G_0. Some element t of
G_0 will transform H_0' into H_0. That same t will then transform H' into a Sylow
subgroup H'' of G also corresponding to H_0, i.e., having H_0 for associated ordinary
group. If then we can show that some element s' of G will transform H'' into H, it will
follow that element s = ts' of G must transform H' into H as required by our theorem.

Our problem therefore reduces to showing that of all Sylow subgroups H^{(a)} of G
corresponding to one and the same Sylow subgroup H_0 of G_0, each can be transformed into
any other by an element of G. Since H_0 is the associated ordinary group of each H^{(a)},
it will be transformed into itself by the elements of each H^{(a)}. If then Ḡ is the
subgroup of G consisting of all the elements of G which transform H_0 into itself, each
H^{(a)} will be a subgroup of Ḡ. On the one hand, therefore, Lagrange's theorem for
polyadic groups shows that if ḡ is the order of Ḡ, then ḡ will be divisible by p^α, but
not by p^{α+1}, while ḡ/p^α will be prime to m−1. On the other hand, since H_0 is
invariant under each element of Ḡ, it will be an invariant subgroup of Ḡ_0, the
associated ordinary group of Ḡ. First then, H_0, whose order proclaims it to be a Sylow
subgroup of Ḡ_0, is the only Sylow subgroup of Ḡ_0 of order p^α. And since Ḡ satisfies
the order condition of our first theorem, it follows from the proof of that theorem that
the subgroups H^{(a)}, which constitute all the Sylow subgroups of order p^α of G, and
hence of Ḡ, corresponding to H_0, actually are the only subgroups of order p^α of Ḡ.

If we expand Ḡ in cosets as regards H_0, each subgroup H^{(a)}, having H_0 for associated
group, will appear as one of these cosets. Since H_0 is invariant under each element of
Ḡ, these cosets are the elements of the m-adic quotient group Γ = Ḡ/H_0. H_0 then appears
as the identity of Γ_0, the associated ordinary group of Γ, each H^{(a)} as an element
σ^{(a)} of Γ. If s is an element of H^{(a)}, s^{m−1} is in H_0. Hence for each a,
[σ^{(a)}]^{m−1} = 1. That is, each σ^{(a)} is a first order element of the m-group Γ.
Conversely, if σ be any first order element of Γ, the corresponding coset of Ḡ
constitutes a subgroup of Ḡ with H_0 for associated group, and hence is an H^{(a)}. The
elements σ^{(a)} are therefore the only first order elements of Γ. But the order of Γ is
ḡ/p^α which is prime to m−1. The preceding section therefore tells us that the elements
σ^{(a)} constitute a complete set of conjugates under the elements of Γ. It follows that
each of the subgroups H^{(a)} of Ḡ can be transformed into any other by an element of Ḡ,
and hence of G. Our proof is thus completed.
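
The criterion for cyclic m-groups recalled at the opening of this section, on which the
order condition g/p^α prime to m−1 is modelled, can be confirmed by brute force for small
g and m. The following is a minimal sketch and not the paper's own construction: it
assumes the cyclic m-group of order g is realized as the residues 1 + k(m−1) modulo
g(m−1) under addition of m residues, and it treats every non-empty subset closed under
that operation as a subgroup (valid for finite sets by cancellation). All function names
are hypothetical.

```python
from itertools import combinations, combinations_with_replacement
from math import gcd

def cyclic_m_group(g, m):
    """Residues 1 + k(m-1) modulo n = g(m-1); the operation adds m residues mod n."""
    n = g * (m - 1)
    return [(1 + k * (m - 1)) % n for k in range(g)], n

def is_closed(subset, m, n):
    """True when every sum of m elements of the subset lies in the subset."""
    members = set(subset)
    return all(sum(t) % n in members for t in combinations_with_replacement(subset, m))

def subgroup_orders(g, m):
    """Orders of all non-empty closed subsets, found by exhaustive search."""
    elems, n = cyclic_m_group(g, m)
    return {size for size in range(1, g + 1)
            for sub in combinations(elems, size) if is_closed(sub, m, n)}

def predicted_orders(g, m):
    """Divisors d of g with g/d prime to m-1, as the criterion asserts."""
    return {d for d in range(1, g + 1) if g % d == 0 and gcd(g // d, m - 1) == 1}

for g in range(1, 7):
    for m in range(2, 5):
        assert subgroup_orders(g, m) == predicted_orders(g, m)

print(subgroup_orders(6, 3))                 # the set {2, 6}
```

For g = 6 and m = 3 there is a “Sylow subgroup” of order 2, since 6/2 = 3 is prime to
m−1 = 2, but none of order 3, since 6/3 = 2 is not.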




Clearly, the Sylow subgroups of order p^α of G are also the Sylow subgroups of order p^α
of the subgroup of G generated by those Sylow subgroups. As that generated subgroup must
satisfy the order condition of our theorem, it follows that the Sylow subgroups of order
p^α also constitute a complete set of conjugates under the elements of the m-group they
generate. As in the case of the preceding section, a weaker form of this result is that
the Sylow subgroups of order p^α of G constitute a generalized complete set of conjugates
under their own elements, that is, each can be obtained from another by a succession of
transforms by their own elements.

Under the condition $g/p^\alpha$ prime to m−1, two of the three parts of Sylow’s
theorem have thus been shown to hold verbatim for polyadic groups. Not so
for the remaining part concerning the number of Sylow subgroups of order $p^\alpha$.
Let us return to the one-many correspondence between the Sylow subgroups
of order $p^\alpha$ of $G_0$ and of G. As stated in different guise in the preceding proof,
an element t of $G_0$ which transforms one Sylow subgroup of $G_0$ into a second
will transform the Sylow subgroups of G corresponding to that first Sylow
subgroup of $G_0$ into those corresponding to the second. Each Sylow subgroup
of order $p^\alpha$ of $G_0$ therefore has the same number $\lambda$ of corresponding Sylow
subgroups of G. As seen above, $\lambda$ is actually the number of first order elements
of an m-group of order $g/p^\alpha$ prime to m−1. Hence our result of the preceding
section, coupled with the corresponding part of the Sylow theorem for ordinary
groups, yields the following as the remaining part of our Sylow theorem
for polyadic groups. Under the condition of the preceding theorems the number
of Sylow subgroups of order $p^\alpha$ of the m-group G of order g is of the form $(1+kp)\lambda$,
where $\lambda$ is a divisor of $g/p^\alpha$ and hence prime to m−1 and p.

In contrast with the above, we are able to extend the ordinary result that
every element and subgroup of order a power of p is contained in a Sylow
subgroup of order $p^\alpha$, only for several still narrower classes of polyadic groups.
It will be convenient to refer to this as the inclusion property. We do have
immediately that under the conditions of the preceding theorems if element s of
order $p^\beta$ of G, $\beta \ge 0$, transforms a Sylow subgroup H of order $p^\alpha$ of G into itself,
then s is in H. For otherwise, by our generalized corollary of §25, s and H
would generate a subgroup of G whose order would be either $p^\alpha$ times a
multiple of p, or $p^\alpha$ times a divisor, not unity, of m−1, neither of which
possibility is consistent with the given conditions. Hence also, if each element of a
subgroup K of order $p^\beta$ of G transforms H into itself, then K is contained in H.
It follows that if G has but one Sylow subgroup of order $p^\alpha$, in particular then
if G is abelian, the inclusion property holds. Again, as in the proof of the first
part of our extension of Sylow’s theorem, we see that if element s of order $p^\beta$
of G transforms a Sylow subgroup $H_0$ of order $p^\alpha$ of the associated ordinary
group $G_0$ into itself, then s must be in a Sylow subgroup of order $p^\alpha$ of G,
namely, $H_0 s$; likewise then for a subgroup K of order $p^\beta$ of G that transforms
$H_0$ into itself. For $K_0$ will then be contained in $H_0$; and with s in K, Sylow
subgroup $H_0 s$ of G will contain $K = K_0 s$. Hence, if $G_0$ has but one Sylow
subgroup, in particular if $G_0$ is abelian, i.e., G semi-abelian, the inclusion property
is satisfied.

If we attempt to generalize the standard proof of the inclusion property
for ordinary groups, we see that while the number of Sylow subgroups of order
$p^\alpha$ of the m-group G is shown by our formula to be again prime to p, our work
on transforms merely shows the number of transforms of a Sylow subgroup
under the polyads formed from s or K to be a divisor of $p^\beta(m-1)$. We are
thus led to the inclusion property only when m−1 itself is a power of the
prime p. More generally, however, let G be reducible to a $\mu$-group G′, with
$\mu - 1$ a power of p, say $p^\gamma$. The abstract containing group G′* of G′, of order
$p^\gamma g$, will then be a containing group of G. The corresponding containing group
of the cyclic m-group generated by s, or of K, will be a subgroup of G′*. It
follows that the above number of transforms will also be a divisor of $p^\gamma g$, and
hence actually be a power of p. The standard proof therefore again
generalizes. Hence, under the condition of the preceding theorems the inclusion property
holds whenever G is reducible to a $\mu$-group with $\mu - 1$ a power of p; in particular,
then, whenever G is reducible to an ordinary group.

An interesting consequence of this result is that the inclusion property
for G holds under the condition of this section whenever G has an invariant
element. For let s be an invariant element of G. Since its m-adic order is a
divisor of g, the condition $g/p^\alpha$ prime to m−1, coupled with our formula for
the real dimension of a cyclic m-group, shows that the cyclic m-group
generated by s is reducible to a $\mu$-group with $\mu - 1$ a power of p. If then we apply
our general criterion of reducibility to a $\mu$-group to this cyclic $\mu$-group, we
obtain a condition which, with the invariance of s under G, becomes the
condition that G be reducible to a $\mu$-group. Note that in this case, which is that
of a G derivable from a 2-group, for each Sylow subgroup of order $p^\alpha$ of $G_0$
there is but one corresponding Sylow subgroup of G. For the invariant
element s will generate some invariant element of order a power of p, which,
consequently, must be in every Sylow subgroup of order $p^\alpha$ of G. On the other
hand two Sylow subgroups of G corresponding to the same Sylow subgroup
of $G_0$ can have no common element.

All of the above concerned the Sylow subgroups of G corresponding to the 
single prime p. As stated early in this section, if the condition $g/p^\alpha$ prime to
m—1 is to be satisfied for two distinct prime factors of g, then g itself must 
be prime to m—1, in which case the condition is satisfied for every prime fac- 
tor of g. Hence, when g is prime to m—1, our extension of Sylow’s theorem is 
universally valid. In particular, if G is abelian with g prime to m—1, then G 
has one and only one Sylow subgroup for each distinct prime divisor of g. 
By the preceding section, G then has one and only one first order element, 
which must then be in each of the Sylow subgroups of G, and, indeed, be the 
only element common to one such subgroup and the subgroup generated by
the others. G, therefore, is then the direct product of its Sylow subgroups;
and when it is reduced to a 2-group, in the one manner allowed by its unique 
first order element, its Sylow subgroups are reduced to the Sylow subgroups 
of that 2-group. 
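
The uniqueness of the first order element in the abelian case can be checked numerically on a small family of examples. The sketch below is only an illustration under stated assumptions: the family of m-groups used (residues modulo g with the operation c(x_1, ..., x_m) = x_1 + ... + x_m + b mod g) is not taken from the text, an element s is treated as first order when c(s, s, ..., s) = s, and the names `make_op` and `first_order_elements` are hypothetical.

```python
from math import gcd


def make_op(g, m, b):
    """Return the m-adic operation c(x_1..x_m) = x_1 + ... + x_m + b (mod g).

    This affine family on the residues mod g is an illustrative assumption,
    not a construction taken from the text.
    """
    def c(*xs):
        assert len(xs) == m
        return (sum(xs) + b) % g
    return c


def first_order_elements(g, m, b):
    """List the elements s with c(s, s, ..., s) = s."""
    c = make_op(g, m, b)
    return [s for s in range(g) if c(*([s] * m)) == s]


if __name__ == "__main__":
    m = 4  # so m - 1 = 3
    for g in (5, 10, 35):  # each g is prime to m - 1
        assert gcd(g, m - 1) == 1
        for b in range(g):
            assert len(first_order_elements(g, m, b)) == 1
    # When g is not prime to m - 1 the count need not be 1.
    print(len(first_order_elements(6, 4, 0)), len(first_order_elements(6, 4, 1)))
```

In this family the first order condition reduces to the congruence (m−1)s ≡ −b (mod g), which has exactly one solution when g is prime to m−1. For g = 6 and m = 4 the final line reports three first order elements when b = 0 and none when b = 1.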

Actually, this last result is but a special instance of a general result. We
have earlier observed that when an m-group G is reduced to a $\mu$-group G′,
each subgroup of G′ is the reduction of a subgroup of G, but a subgroup of G
may not reduce to a subgroup of G′. On the other hand, let G satisfy our
general condition $g/p^\alpha$ prime to m−1. Then G′ satisfies the corresponding
condition $g/p^\alpha$ prime to $\mu - 1$. Our extension of Sylow’s theorem is therefore
applicable to both groups. Since transforms of elements by elements are the
same in G and in G′, our complete set of conjugates result, applied to a Sylow
subgroup of order $p^\alpha$ of G′ and that of G reducing to it, shows that when G is
reduced to G′ the Sylow subgroups of order $p^\alpha$ of G are reduced to the Sylow
subgroups of order $p^\alpha$ of G′. Finally, if m−1 is prime to g, the Sylow subgroups
of G, without qualification, are reduced to the Sylow subgroups of G′.

28. Representation of an arbitrary m-adic group as a regular m-adic
substitution group. We shall prove our result without the use of the coset
theorem. The proof will then, indeed, immediately lead to another proof of the
coset theorem, actually, the writer’s original proof(32).

(32) While the proof as given is for finite m-groups, it holds with little change for all
m-groups. Hence the full generality of the consequent proof of the coset theorem.

Let G be an arbitrary m-group of order g. The classes $\Gamma_1, \Gamma_2, \ldots, \Gamma_{m-1}$ are
then to have for members the g classes of equivalent i-ads for $i = 1, 2, \ldots, m-1$.
It will be convenient to symbolize the g members of $\Gamma_i$ by $a_{ij}$, $j = 1, 2, \ldots, g$.
Let s be any element of G. Then, as proved in more general form in §3, if
the i-ads $\{s_1', s_2', \ldots, s_i'\}$ and $\{s_1'', s_2'', \ldots, s_i''\}$ of G are equivalent, the
(i+1)-ads $\{s_1', s_2', \ldots, s_i', s\}$ and $\{s_1'', s_2'', \ldots, s_i'', s\}$ of G are equivalent,
and conversely. s thus becomes an operator which carries the g classes
of equivalent i-ads in 1-1 fashion into the g classes of equivalent (i+1)-ads.
Furthermore, if c represents the m-adic operation of G, then if the (m−1)-ads
$\{s_1', s_2', \ldots, s_{m-1}'\}$ and $\{s_1'', s_2'', \ldots, s_{m-1}''\}$ are equivalent, the elements
$c(s_1' s_2' \cdots s_{m-1}' s)$ and $c(s_1'' s_2'' \cdots s_{m-1}'' s)$ are identical, and conversely. It
follows that s thus carries in 1-1 fashion the letters of $\Gamma_1$ into those of $\Gamma_2$,
those of $\Gamma_2$ into those of $\Gamma_3$, and so on, and those of $\Gamma_{m-1}$ into those of $\Gamma_1$,
that is, determines an m-adic substitution on the $\Gamma$’s.

Now given any i-ad $\{s_1, s_2, \ldots, s_i\}$, and any (i+1)-ad $\{s_1', s_2', \ldots, s_{i+1}'\}$,
there is one and only one element s of G for which the (i+1)-ads
$\{s_1, s_2, \ldots, s_i, s\}$ and $\{s_1', s_2', \ldots, s_{i+1}'\}$ are equivalent. It follows on the
one hand that no two distinct elements of G can yield the same m-adic
substitution on the $\Gamma$’s. The correspondence between the elements of G and the
m-adic substitutions they determine is therefore 1-1. And since the m-adic
substitution determined by $c(s_1 s_2 \cdots s_m)$ is clearly the product of the m-adic
substitutions determined by $s_1, s_2, \ldots, s_m$, it follows that the m-adic
substitutions determined by the elements of G constitute an m-adic substitution group
simply isomorphic with G. Furthermore, the initial observation of this
paragraph shows that given any two letters in successive $\Gamma$’s there is one and only
one element s of G, and hence one and only one m-adic substitution of the
simply isomorphic substitution group, that carries the letter in the first $\Gamma$ into
that of the second. This m-adic substitution group is therefore regular. We
have consequently proved the following generalization of Cayley’s theorem.
Every m-adic group can be represented as a regular m-adic substitution group.
In this connection, as seen in §16, the argument of §14 shows that two regular
m-adic substitution groups on the same letters which are simply isomorphic are
conjugate.
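
The construction of this section can be carried out by brute force on a small example. The following Python sketch is offered only as an illustration under stated assumptions. The m-group used (residues modulo g with c(x_1, ..., x_m) = x_1 + ... + x_m + b mod g) is not taken from the text. Two i-ads are treated as equivalent when every completion by the same m−i elements on the right gives the same element, which is a working definition assumed here rather than the formulation of §3. The names `class_of`, `substitution` and `is_regular` are hypothetical.

```python
from itertools import product


def affine_op(g, m, b):
    """c(x_1..x_m) = x_1 + ... + x_m + b (mod g); an illustrative assumption."""
    return lambda *xs: (sum(xs) + b) % g


def class_of(iad, c, g, m):
    """Label the class of an i-ad (1 <= i <= m-1) by i together with its values
    under every completion on the right (assumed working definition)."""
    tails = product(range(g), repeat=m - len(iad))
    return (len(iad),) + tuple(c(*iad, *tail) for tail in tails)


def substitution(s, c, g, m):
    """Map each class of i-ads to the class reached by adjoining s on the right."""
    image = {}
    for i in range(1, m):
        for iad in product(range(g), repeat=i):
            ext = iad + (s,)
            if i == m - 1:
                ext = (c(*ext),)  # an m-ad is replaced by its value, a 1-ad
            src, dst = class_of(iad, c, g, m), class_of(ext, c, g, m)
            assert image.setdefault(src, dst) == dst  # well defined on classes
    return image


def is_regular(subs, m):
    """True when exactly one s carries any given class into any given class
    of the next set of classes."""
    classes = list(next(iter(subs.values())))
    for a in classes:
        for t in classes:
            if t[0] == a[0] % (m - 1) + 1:
                if sum(sub[a] == t for sub in subs.values()) != 1:
                    return False
    return True


g, m, b = 4, 3, 1
c = affine_op(g, m, b)
subs = {s: substitution(s, c, g, m) for s in range(g)}
```

Here `subs[s]` plays the part of the m-adic substitution determined by s on the g(m−1) classes, and `is_regular(subs, m)` checks the regularity property stated above for this one example.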

If we now wish to obtain the coset theorem from this result, we need 
merely observe that the ordinary group generated by the m-adic substitutions 
of the representation of G, as in the case of all m-adic substitution groups, 
is a containing group of the representation of G of index m—1, and hence by 
resymbolization of its elements can be made a containing group of G leading 
to the desired result. Since we have developed our theory of abstract polyadic 
groups abstractly, comparatively few applications of this generalization of 
Cayley’s theorem are to be found in the present paper. Perhaps the most im- 
portant of these is that it allows the concept of holomorph to apply to an arbi- 
trary abstract polyadic group. 

29. Invariant subgroups and quotient groups; the m-adic central quotient
group. The present section may be considered a continuation of §4, our
attention now being restricted to finite polyadic groups. We recall that if G is
an m-group with ordinary associated group $G_0$, then every subgroup $H_0$ of $G_0$
that is invariant under G leads to an m-adic quotient group $Q = G/H_0$
isomorphic with G. Clearly, if $H_0$ is of order h, the isomorphism between G and
Q is (h, 1). $H_0$ and Q may be called complementary groups as regards G. Since
the elements of Q are the cosets of G as regards $H_0$, the order of G is the
product of the orders of $H_0$ and Q. Similarly for an actual subgroup H of G
corresponding to $H_0$.

Let $\sigma$ be any element of Q, s any one of the elements of the corresponding
coset. Then the m-adic order n of s must be divisible by the m-adic order $\nu$
of $\sigma$. For, since $s^{[n]} = s$, $\sigma^{[n]} = \sigma$, and hence n is a multiple of $\nu$. That is, the
order of any element of an m-adic quotient group divides the orders of all the
elements of the corresponding coset. We recall that each coset corresponding to a
first order element of Q constitutes a subgroup of G. These subgroups in fact
are all the subgroups of G having $H_0$ for associated ordinary group, and hence
also are semi-invariant subgroups of G. In particular, if $H_0$ is of order prime
to m−1, each coset thus corresponding to a first order element of Q has at
least one first order element.

Unlike the corresponding situation for ordinary groups, an element $\sigma$ of Q
may be of order a power of a prime p without any element of the corresponding
coset being of order a power of that prime. Thus, let G be a cyclic m-group
of order $p^\alpha k$ where k, prime to p, is not prime to m−1. Then no element of G
can have an order a power of p. But with $H_0$ the subgroup of $G_0$ of order k,
$Q = G/H_0$ is cyclic, and of order $p^\alpha$. Some element $\sigma$ of Q will then indeed be of
order $p^\alpha$, while the corresponding coset has no element of order a power of p.
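
A small instance of this phenomenon can be computed directly. The sketch below is an illustration under stated assumptions, not a construction from the text: G is taken to be the residues modulo 6 with the triadic operation c(x, y, z) = x + y + z + 1 mod 6 (so m = 3, p = 3, k = 2), the quotient by the subgroup of order k is modelled as the same operation on residues modulo 3, and the m-adic order of s is computed as the least n ≥ 1 with $s^{[n]} = s$, where $s^{[n]}$ is the value of n(m−1)+1 copies of s. The names `affine_op` and `madic_order` are hypothetical.

```python
def affine_op(g, m, b):
    """c(x_1..x_m) = x_1 + ... + x_m + b (mod g); an illustrative assumption."""
    return lambda *xs: (sum(xs) + b) % g


def madic_order(s, c, m):
    """Least n >= 1 with s^[n] = s, using s^[n+1] = c(s^[n], s, ..., s)."""
    power, n = s, 0
    while True:
        power = c(power, *([s] * (m - 1)))
        n += 1
        if power == s:
            return n


p, k, m = 3, 2, 3        # order g = p * k, with k not prime to m - 1 = 2
g = p * k
G = affine_op(g, m, 1)   # generated by the single element 0
Q = affine_op(p, m, 1)   # cosets of the order-k subgroup, modelled as residues mod p

orders_G = {s: madic_order(s, G, m) for s in range(g)}
orders_Q = {t: madic_order(t, Q, m) for t in range(p)}
```

In this instance every element of G has order 6 or 2, so none has order a power of 3, while two elements of Q have order 3. The order of each element of Q divides the orders of the elements of its coset, in agreement with the preceding paragraph.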

However, let $\sigma$ be of order $p^\beta$, $H_0$ of order $p^\alpha k$, k prime to p, and suppose
that k is prime to m−1. The elements of the cosets corresponding to the
m-adic powers of $\sigma$ will then together constitute a subgroup G′ of G of order
$p^{\alpha+\beta}k$. Since k is prime to m−1, G′ will have a Sylow subgroup K of order
$p^{\alpha+\beta}$(33). As the crosscut of $K_0$ and $H_0$ must be of order a power of p, it follows
that K must have exactly $p^\alpha$ elements in each of the $p^\beta$ cosets of G′ as regards
$H_0$. The coset corresponding to $\sigma$ therefore has at least one element of order $p^\gamma$
with, of course, $\gamma \ge \beta$. That is, if the order of an element of an m-adic quotient
group is a power of a prime number p, while the largest divisor prime to p of the
order of the complementary group is prime to m−1, then the corresponding coset
involves an element whose order is a power of p.

(33) Unless $\alpha = \beta = 0$. But that case has already been treated. Actually, the first order
elements of G may then conveniently be considered its Sylow subgroups of order $p^0$.

We recall the ordinary group result that every invariant subgroup of
index 2 under any group includes all the elements of odd order contained in
this group. In the case of an m-adic quotient group of order two, we recall
our results of §23, and note that for m odd no such result can be expected.
In fact, when the quotient group consists of two first order elements, each
of the corresponding cosets, both then invariant subgroups of the given group
as a consequence of the abelianism of the quotient group, may have an
element of odd order; while when the quotient group consists of two second order
elements both cosets consist of even order elements only. On the other hand,
for m even the quotient group must consist of one first and one second order
element. The coset corresponding to the first order element of the quotient
group will then be an invariant subgroup of the given group, and any
elements of odd order in the given group must be included in that invariant
subgroup.

If $H_0$ is a subgroup of $G_0$, the index of $H_0$ under G may be defined as the
order of G divided by the order of $H_0$, and, of course, gives the number of
cosets in the expansion of G in either right or left cosets as regards $H_0$; likewise
for an H actually a subgroup of G. In the case of ordinary groups, we
know that the index of the crosscut of two subgroups of a group under one
of those subgroups is less than or equal to the index of the other subgroup
under the group; while if the two subgroups are conjugate under the group,
the inequality always prevails. If now H is a subgroup of an m-group G, $K_0$ a
subgroup of $G_0$, let $L_0$ be the crosscut of the associated ordinary group $H_0$
of H, and $K_0$. Then, by writing G in the form $G_0 s$, with s in H, we see that the
expansion of $H_0$ in right cosets as regards $L_0$, and the expansion of $G_0$ in right
cosets as regards $K_0$, become the expansions of H and G in right cosets as
regards $L_0$ and $K_0$ respectively. It then follows immediately that the index
of $L_0$ under H is less than or equal to the index of $K_0$ under G. Now let $K_0$
be the associated ordinary group of a subgroup K of G conjugate to H under
G. Since H and K are subgroups of G, we see from the discussion in §24 that H
can also be transformed into K by some element t of $G_0$. Since t then
transforms $H_0$ into $K_0$, the 2-group result for conjugate subgroups is applicable
and thus yields the following. If H and K are conjugate subgroups of an
m-group G, the index of the crosscut of $H_0$ and $K_0$ under one of the subgroups is
always less than the index of these subgroups under G.

In this formulation we use “subgroup” in the strict sense, and thereby
avoid the need of specifying that $H_0$ and $K_0$, or H and K, are distinct. Now,
as in the corresponding 2-group illustration, let H be of index 2 under G.
With K conjugate to H, the above result shows $H_0$ and $K_0$ to be identical.
H is then at least a semi-invariant subgroup of G. But since the resulting
quotient group $G/H_0$, being of order two, is abelian, it follows that H is
actually invariant under G. Hence, as for ordinary groups, a subgroup of index 2
under any polyadic group is invariant.

If an m-group G has at least one invariant element, these invariant ele- 
ments clearly constitute an invariant subgroup of G which may be called the 
central of G. Note that a necessary and sufficient condition that our finite 
m-group G have a central is that it be derivable from an ordinary group. The 
central C of G, when it exists, is of course abelian, and coincides with G when 
and only when G is abelian. The quotient group G/C may be called the central 
quotient group of G, and, as with ordinary groups, is easily proved noncyclic 
whenever G is non-abelian. 

It is readily seen that, when the central C of G exists, the associated
ordinary group $C_0$ of C consists of all the elements of $G_0$ which are invariant
under G. In general then, let us define the associated central $C_0$ of G as the
subgroup of $G_0$ consisting of all the elements of $G_0$ invariant under G. $C_0$ then
always exists, and being a subgroup of $G_0$ invariant under G, always leads to a
quotient group $G/C_0$. Since $G/C = G/C_0$ whenever C exists, we may call $G/C_0$
the central quotient group of G irrespective of the existence of C. Since each
element of $C_0$ is also invariant under $G_0$, $C_0$ is a subgroup of the central of $G_0$
when it does not coincide with the central of $G_0$. It is readily seen, in fact,
that the central of $G_0$ is invariant under G, each element of G yielding the
same automorphism of that central. It follows that $C_0$ consists of those
elements of the central of $G_0$ which are left invariant under any one element of G.
In particular, when C exists, $C_0$ will coincide with the central of $G_0$. In any
case, $C_0$ is abelian, and coincides with $G_0$ when and only when G is abelian.
It is then again easily proved that the central quotient group of an m-group G
is noncyclic whenever G is non-abelian.

Any subgroup of G having $C_0$ for associated group leads to the central
quotient group $G/C_0$ and may be called a relative central of G. The relative
centrals of G are then those cosets, if any, of the expansion of G as regards $C_0$
which correspond to first order elements of the central quotient group. They
are of course semi-invariant subgroups of G, and are easily seen to be abelian.
They can be independently characterized as the maximal subgroups of G
having the property that, on being transformed by an element of G, each element
of the subgroup is multiplied by one and the same element t of $G_0$. Together,
the elements of the relative centrals of G constitute all elements s of G with
$s^{m-1}$ in $C_0$. The relative centrals corresponding to invariant first order
elements of the central quotient group are characterized by the above multiplier
t’s always being in $C_0$, in which case, indeed, $t^{m-1} = 1$. The unique central C,
when it exists, is then the only one for which t is always 1.

30. Commutator, semi-commutator, and quasi-commutator subgroups. A
direct extension to polyadic groups of the concepts of commutator, and
commutator subgroup, is immediately obtainable. Given an m-group G, and in
the notation of the abstract containing group of G, if $s_1$ and $s_2$ are any two
elements of G, we may, as in ordinary theory, define the commutator of $s_1$
and $s_2$ to be $t = s_1^{-1} s_2^{-1} s_1 s_2$. We shall also refer to $s_1$ and $s_2$ as the elements of
the commutator. The commutator of $s_1$ and $s_2$ is then not an element of G,
but of $G_0$, the associated ordinary group of G, and is indeed that element of $G_0$
by which $s_1$ has to be multiplied on the right to yield the transform of $s_1$
under $s_2$. The different commutators thus formed from elements of G therefore
generate a subgroup of $G_0$, if not $G_0$ itself, which may then be called the
commutator subgroup for G.
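
The definition can be tried out on a familiar small example. The sketch below is an illustration under stated assumptions: the 3-group chosen (the odd permutations on three letters under the triple product, with the symmetric group as containing group and the even permutations as associated ordinary group) is an example supplied here rather than one drawn from this section, and the helper names `mul`, `inv`, `is_even` and `generated` are hypothetical.

```python
from itertools import permutations


def mul(p, q):
    """Product pq of permutations written as image tuples: apply p first, then q."""
    return tuple(q[i] for i in p)


def inv(p):
    """Inverse permutation."""
    out = [0] * len(p)
    for i, j in enumerate(p):
        out[j] = i
    return tuple(out)


def is_even(p):
    """Parity test by counting inversions."""
    n = len(p)
    inversions = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
    return inversions % 2 == 0


def generated(gens, identity):
    """Subgroup generated by gens in a finite group, by closing under products."""
    group, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for y in gens:
            z = mul(x, y)
            if z not in group:
                group.add(z)
                frontier.append(z)
    return group


S3 = list(permutations(range(3)))
G = [p for p in S3 if not is_even(p)]   # the 3-group of odd permutations
G0 = {p for p in S3 if is_even(p)}      # its associated ordinary group

# Commutators t = s1^-1 s2^-1 s1 s2 formed from elements of G.
commutators = {mul(mul(inv(s1), inv(s2)), mul(s1, s2)) for s1 in G for s2 in G}
commutator_subgroup = generated(commutators, tuple(range(3)))
```

In this example every commutator is an even permutation, and the commutator subgroup for G turns out to be the whole of the associated ordinary group.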

As in ordinary group theory, the theory of commutator subgroups for 
polyadic groups is intimately bound up with the property of abelianism. But 
now our general formulation of semi-abelianism given in §7 suggests the need 
of a corresponding formulation of semi-commutator subgroup. The relative 
complexity of the resulting formulation then suggests a still further general- 
ization of both concepts to what we term quasi-abelianism, and quasi-com- 
mutator subgroup. This wider generalization is also significant for ordinary 
groups. But while thus intimately related to certain recent work, in particular
of Hall and Neumann(34), its direction seems to be new.

(34) B. H. Neumann, Identical relations in groups I, Mathematische Annalen, vol. 114
(1937), pp. 506-525. References will here be found to the work of Hall.

The immediate connection between abelianism and commutator subgroup
is more clearly in evidence if we rewrite the usual $s_1 s_2 = s_2 s_1$ for the former in
the equivalent form $s_1^{-1} s_2^{-1} s_1 s_2 = 1$. Now the expression $s_1^{-1} s_2^{-1} s_1 s_2$ that thus
enters into both concepts is but a special instance of a word in the sense of Hall,
or a rational expression in the sense of Baer. In general, a word W will be
any expression of the form $s_{i_1}^{\nu_1} s_{i_2}^{\nu_2} \cdots s_{i_N}^{\nu_N}$, where the exponents are arbitrarily
+1 or −1, the subscripts arbitrarily equal or unequal. If such an expression
is to assume the value 1 for any choice of s’s in an m-group G, the notation
being that of the abstract containing group of G, the exponents must satisfy
the condition $\nu_1 + \nu_2 + \cdots + \nu_N \equiv 0 \pmod{m-1}$. Given m, consider then any
specific class of words $W_i$ whose exponents satisfy this condition. An m-group
G will then be said to be quasi-abelian of corresponding formal type if the
equations $W_i = 1$ are satisfied for every assignment of elements in G as values
of the s’s, i.e., form a set of identical relations for G in the sense of Neumann.
Now given an arbitrary m-group G, as a result of the exponent condition on
the given class of words $W_i$ each word assumes an element of $G_0$ as value when
its letters are assigned elements of G as values. We shall call these words
formal quasi-commutators, their values quasi-commutators, of the given
formal type. The subgroup of $G_0$ generated by all of the quasi-commutators thus
obtainable from elements of G will then be called the quasi-commutator
subgroup for G of corresponding formal type.
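
The role of the exponent condition can be seen on a small case. The following sketch is an illustration under stated assumptions: it uses the 3-group of odd permutations on three letters (an example supplied here, with the even permutations as associated ordinary group), represents a word as a list of (letter index, exponent) pairs, and uses the hypothetical helper names `word_value`, `satisfies_exponent_condition` and `values`.

```python
from itertools import permutations, product


def mul(p, q):
    """Product pq of permutations written as image tuples: apply p first, then q."""
    return tuple(q[i] for i in p)


def inv(p):
    """Inverse permutation."""
    out = [0] * len(p)
    for i, j in enumerate(p):
        out[j] = i
    return tuple(out)


def is_even(p):
    """Parity test by counting inversions."""
    n = len(p)
    inversions = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
    return inversions % 2 == 0


def word_value(word, assignment):
    """Value, in the containing group, of a word given as (letter, exponent) pairs."""
    value = tuple(range(3))
    for letter, exponent in word:
        s = assignment[letter]
        value = mul(value, s if exponent == 1 else inv(s))
    return value


def satisfies_exponent_condition(word, m):
    """True when the exponents sum to 0 modulo m - 1."""
    return sum(e for _, e in word) % (m - 1) == 0


G = [p for p in permutations(range(3)) if not is_even(p)]  # odd permutations, m = 3


def values(word, n_letters):
    """All values of the word as its letters range over G."""
    return {word_value(word, a) for a in product(G, repeat=n_letters)}


commutator_word = [(0, -1), (1, -1), (0, 1), (1, 1)]  # exponent sum 0
odd_word = [(0, 1), (1, 1), (0, 1)]                   # exponent sum 3
```

With m = 3 the condition asks for an even exponent sum. In this example the values of `commutator_word` are all even permutations, that is, elements of the associated ordinary group, while the values of `odd_word`, which violates the condition, are all odd.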

In particular, any formulation of semi-abelianism as given in §7 can be
rewritten in the above form. We correspondingly have formal semi-commutators,
semi-commutators, and semi-commutator subgroup for an m-group G.
While a certain degree of arbitrariness enters into the manner in which the
equations of §7 are thus rewritten, it will be seen that this is irrelevant in
the formation of the corresponding semi-commutator subgroup for G. In fact,
our central theorem will be to the effect that the correspondence between type
of quasi-abelianism and type of quasi-commutator subgroup, at present
purely formal, is in fact intrinsic(35).

(35) Note that while we are interested in all, in the present instance finite, m-groups
satisfying a given set of identical relations, Neumann considered instead the class of all identical
relations satisfied by a given, of course ordinary, group. But it is the former concept that
generalizes abelianism. Again, Hall, in the first paper cited by Neumann, builds up higher
commutator forms merely out of ordinary commutators. His later concept of word-subgroup is
identical, for ordinary groups, with our quasi-commutator subgroup. But again the emphasis
is on all word-subgroups of a given group, rather than word-subgroup of given type for all
groups, say of cardinal number less than, or less than or equal to, a given cardinal. And so our
particular contribution of the relation between type of word-subgroup and type of identical
relations is again unnoticed. We hasten to add that the researches of these authors in the
directions they do pursue are profound. We also note that on reading Neumann’s paper we changed
our original formulation involving a finite number of identical relations to an arbitrary set of
identical relations. In the case of our formulation of semi-abelianism, the finite can stand; for
our theorem of §7 shows that an infinite set would always be equivalent to a finite subset
thereof.

Our initial development, paralleling that of ordinary theory up to its main
conclusion, will be given for quasi-commutator subgroups, the results then
also holding for the successive specialization to semi-commutator and
commutator subgroups. Consider then any one formulation of quasi-commutator
subgroup for m-groups. From its very definition we then have that the
quasi-commutator subgroup for an m-group G reduces to the identity when and only
when G is quasi-abelian of corresponding formal type. Clearly the transform of $W_i$
by s is the same expression with each letter in $W_i$ replaced by its transform
under s. That is, the transform of each quasi-commutator by an element of G
is also a quasi-commutator. Hence, the quasi-commutator subgroup for G of
the given formal type is a subgroup of $G_0$ invariant under G, when not $G_0$ itself.
We may therefore form the m-adic quotient group of G relative to this
quasi-commutator subgroup, i.e., the corresponding quasi-commutator quotient
group of G. We then readily see that as in the ordinary theory, the
quasi-commutator quotient group of G of given formal type is quasi-abelian of the
corresponding formal type. For the isomorphism between G and the quotient
group shows that a quasi-commutator formed from any elements of the
quotient group corresponds to the quasi-commutator formed in the same way
from corresponding elements of G, and hence is always the identity.
Conversely, consider any quotient group of G which is quasi-abelian according to
the given formulation. Again quasi-commutators of G correspond to
quasi-commutators of this quotient group. Since the latter quasi-commutators can
only be the identity, the former must be in the subgroup of $G_0$ complementary
to this quotient group. That is, every subgroup of $G_0$ which is invariant under G,
and whose complementary quotient group is quasi-abelian of given formal type,
contains the quasi-commutator subgroup for G of corresponding formal type.

We are now able to prove the following fundamental theorem. If two
formulations of quasi-abelianism for m-adic groups are such that every m-group
satisfying either satisfies the other, then the corresponding quasi-commutator subgroups
for an m-group are always identical. For let A′ and A″ symbolize the two
formulations of quasi-abelianism. If then, for a given m-group G, $C_0'$ and $C_0''$
are the quasi-commutator subgroups corresponding to A′ and A″
respectively, the quasi-commutator quotient group $G/C_0'$ satisfies A′, $G/C_0''$
satisfies A″. By our hypothesis, therefore, the m-group $G/C_0'$ also satisfies A″,
$G/C_0''$ also satisfies A′. Hence, by our last theorem, $C_0'$ contains $C_0''$ and $C_0''$
contains $C_0'$, that is, $C_0'$ and $C_0''$ are identical.

The converse of this theorem is immediate; for if two formulations of 
quasi-commutator subgroup lead to identical subgroups for each m-group, 
then, if either of these subgroups is the identity, the other also is the identity. 
If then we say that two formulations of quasi-abelianism for m-adic groups 
define the same type of quasi-abelianism if every m-group satisfying either 
satisfies the other, while two formulations of quasi-commutator subgroup for 
m-adic groups define the same type of quasi-commutator subgroup if they yield 
identical subgroups for each m-group, we can conclude that there is a 1-1 cor- 
respondence between types of quasi-abelianism for m-adic groups and types of 
quasi-commutator subgroup. The correspondence between quasi-abelianism 
and quasi-commutator subgroup, originally depending on a particular formu- 
lation, has thus been shown to be intrinsic. 

A useful partial consequence of our earlier proof is the following. If two 
formulations of quasi-abelianism for m-adic groups are such that every m-group 
satisfying the first satisfies the second, then the quasi-commutator subgroup for an
m-group corresponding to the first formulation always contains the one corre-
sponding to the second. In this connection note that quasi-commutator sub- 
groups of different types may be identical for a particular m-group. We there- 
fore pause to prove the following. Given any finite set of distinct types of 
quasi-abelianism, there exists an m-group for which the corresponding quasi- 
commutator subgroups are all distinct. In fact, for each pair of these types 
there must exist an m-group quasi-abelian according to one type, but not 
according to the other. Represent these m-groups say as m-adic substitution 
groups on different letters, and form the m-group G therefrom by the direct 
product method. G then has the desired property. For it is readily proved 
from commutativity considerations that each quasi-commutator of G is the 
product of quasi-commutators of the same form, one for each of the above 
constituent groups of G, and conversely. Hence the quasi-commutator sub- 
groups for G corresponding to any two of the given types of quasi-abelianism 
have, on the letters of the corresponding constituent group of G, a constituent 
group which is the identity in one case, not the identity in the other, and 
hence are themselves distinct. 

Our basic “equivalence theorem” immediately translates our determina- 
tion of the distinct types of semi-abelianism effected in §7 into a determina- 
tion of the distinct types of semi-commutator subgroup. Since the proof of 
distinctness for the former was carried through by means of finite groups, we 
can therefore state that there are as many distinct types of semi-commutator 
subgroups for.m-adic groups as there are distinct divisors of m—1. For a divisor 
p of m—1, the semi-commutator subgroup corresponding to p-semi-abelian- 
ism may be called the p-semi-commutator subgroup. From the above more 
general result it follows that there exists an m-group for which the semi-com- 
mutator subgroups of all the distinct types are distinct. In this case a simpler 
example of such a group is obtained merely by taking the direct product of 
groups, one for each divisor p—1 of m—1, which, as in §7, are m-groups 
p-semi-abelian, but not p’-semi-abelian for any divisor p’—1 of p—1 other 
than p—1. Whether the semi-commutator subgroups of a given m-group are 
distinct or not, we may note the following relations between them. Since 
p1-semi-abelianism implies p2-semi-abelianism whenever p1−1 is a divisor of
p2−1, it follows that in this case the p1-semi-commutator subgroup contains
the p2-semi-commutator subgroup. More generally then, the crosscut of the
p1- and p2-semi-commutator subgroups contains the p3-semi-commutator
subgroup, where p3−1=L.C.M.(p1−1, p2−1), while the subgroup generated by
the p1- and p2-semi-commutator subgroups is contained in the
p-semi-commutator subgroup, where p−1=H.C.F.(p1−1, p2−1). In the second
case, however, we can prove that the subgroup generated by the p1- and
p2-semi-commutator subgroups is the p-semi-commutator subgroup with
p−1=H.C.F.(p1−1, p2−1).
For by the general theorem of §7, the semi-abelianism defined by the
combination of p1-semi-abelianism and p2-semi-abelianism is equivalent to
p-semi-abelianism with the above p. The p-semi-commutator subgroup is therefore
also the subgroup generated by all semi-commutators of the p1 and p2 formal
types, and hence by the p1- and p2-semi-commutator subgroups themselves.
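
The divisor arithmetic behind these relations can be tried on a small case. The sketch below is only illustrative: the function names are hypothetical, and it assumes nothing beyond the statements above, namely that there is one type for each divisor p−1 of m−1, that the crosscut of two semi-commutator subgroups contains the one given by the L.C.M., and that their join is the one given by the H.C.F.

```python
from math import gcd


def semi_abelian_types(m):
    """All p with p-1 a divisor of m-1 (one type per divisor)."""
    return [d + 1 for d in range(1, m) if (m - 1) % d == 0]


def crosscut_contains(p1, p2):
    """The p3 with p3-1 = L.C.M.(p1-1, p2-1)."""
    a, b = p1 - 1, p2 - 1
    return a * b // gcd(a, b) + 1


def join_is(p1, p2):
    """The p with p-1 = H.C.F.(p1-1, p2-1)."""
    return gcd(p1 - 1, p2 - 1) + 1


print(semi_abelian_types(13))                  # [2, 3, 4, 5, 7, 13]
print(crosscut_contains(5, 7), join_is(5, 7))  # 13 3
```

For m = 13, for instance, the 5- and 7-semi-commutator subgroups have a crosscut containing the 13-semi-commutator subgroup and together generate the 3-semi-commutator subgroup.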

In our march to the equivalence theorem we neglected certain develop- 
ments related only to semi-commutators, or merely commutators, which 
might well have come first. In the limited generality of the first specialization 
we note that each semi-commutator subgroup for an m-group G contains the 
commutator subgroup of the ordinary associated group Go of G. In fact, if Ho 
be such a semi-commutator subgroup, the quotient group Go/Ho can be
identified as the associated ordinary group of the semi-commutator quotient
group G/Ho. Since G/Ho is semi-abelian, Go/Ho, by a result of §7, is abelian,
whence the above. 

Clearly, two elements of a polyadic group are commutative when and only 
when their commutator is the identity. As in the corresponding situation for 
ordinary groups, it is readily proved that if the elements of a commutator 
respectively belong to two invariant subgroups of a polyadic group, the com- 
mutator is contained in the crosscut of the associated ordinary groups of those 
subgroups. It follows that if two invariant subgroups of a polyadic group are 
such that their associated ordinary groups have only the identity in common, then 
every element of one of these subgroups is commutative with every element of the 
other. Since two subgroups having at least one element in common have as 
many elements in common as have their associated ordinary groups, the 
above result is in this case equivalent to the following. If two invariant sub- 
groups of a polyadic group have one and only one element in common, then every 
element of one of these subgroups is commutative with every element of the other. 
Actually, this special case is almost an immediate consequence of the corre- 
sponding ordinary theorem; for the one common element is then an invariant 
first order element of each of the subgroups, and hence of the polyadic group 
they generate(86), so that all three of these groups are reducible, and
simultaneously so, to ordinary groups.

(86) Their direct product, therefore, as defined in §25.

We have observed that the commutator of elements s1 and s2 of G is the
element of Go which must be multiplied into s1 to obtain the transform of s1
under s2. Hence the complete set of conjugates of s1 under G can be obtained
by multiplying s1 by commutators formed from elements of G. Since the
commutator subgroup for G is invariant under G, it readily follows from this that
all the transforms of an i-ad of G by polyads of G can be obtained by
multiplying the i-ad by elements of the commutator subgroup for G. More specifically,
it can be proved by way of the equivalence theorem that the transforms of an
i-ad of an m-group G by the elements of G can be obtained by multiplying
one such transform by elements of the p-semi-commutator subgroup for G,
where p−1=H.C.F.(i, m−1); whence likewise for the transforms of the i-ad
by the j-ads of G with fixed j. It follows from this result that if G is
p-semi-abelian, all elements of G transform the i-ad into the same i-ad, as also
do all j-ads with fixed j, a fact also easily shown directly.

We have defined an m-group G to be simple if Go has no subgroup other
than the identity invariant under G. It follows then immediately that if a 
simple m-group G is not quasi-abelian of specified type, the corresponding 
quasi-commutator subgroup for G is identical with Go. If then, rather nar- 
rowly, we define G to be perfect if the commutator subgroup for G is identical 
with Go, it follows that every simple polyadic group of composite order is per- 
fect. For otherwise G would be abelian, while Go would possess a subgroup 
other than the identity, yet invariant under G. 

As in the case of ordinary groups, a subgroup of an m-group G may be 
called a characteristic subgroup of G if it corresponds to itself under every 
automorphism of G. Every automorphism of G determines an automorphism 
of Go. We may then define a subgroup of Go to be an associated characteristic
subgroup of G if it corresponds to itself under every automorphism of G. In
the case of invariance, a subgroup of Go invariant under G is always invariant
under Go, but not conversely. Here the reverse situation holds. For clearly a 
characteristic subgroup of Go is also an associated characteristic subgroup 
of G, but not always conversely, as shown by the following example. The com- 
plete m-adic 6-group for m=3 is a triadic group of order four which has 
exactly two second order subgroups, one cyclic, the other non-cyclic. Each 
of the subgroups is therefore a characteristic subgroup of the group. Evi- 
dently the associated ordinary group of any characteristic subgroup of a poly- 
adic group is an associated characteristic subgroup of the group. On the other 
hand, the associated ordinary group of this triadic 5-group is the ordinary 
axial group, and hence itself has no characteristic subgroup of order two. 

It is readily proved that if G is non-abelian, then the central of G, if exist- 
ent, is a characteristic subgroup of G, while the associated central of G is an 
associated characteristic subgroup of G. We now observe that every quasi- 
commutator subgroup for G, when not identical with Go, is an associated 
characteristic subgroup of G. In fact it is readily seen that under any auto- 
morphism of G a quasi-commutator involving certain elements of G will cor- 
respond to a quasi-commutator of the same form involving the corresponding 
elements of G. As the first set of elements take on all values in G, so do the 
second, so that actually the set of quasi-commutators of G of given formal 
type corresponds to itself under the automorphism. 

Granting that the concept of quasi-abelianism and quasi-commutator
subgroup has a certain degree of generality, ever further generalizations
suggest themselves(87). Perhaps a guiding principle in such generalizations might
be the existence of an equivalence theorem. It may then be of interest to
present our equivalence theorem in the following light. Each type of
quasi-commutator subgroup for m-groups may be thought of as a function which
assumes for each m-group G a subgroup of Go, if not Go, as value. Our
equivalence theorem then asserts that this function is completely determined when
it is known for what values of its argument it assumes the value 1.

(87) Thus, if the above concepts be termed categorical, the following generalization, which
we give only for ordinary groups, can be effected. With each of a given class of words W_i
is associated a class of words W_ij involving only the letters of W_i. A group G will then be
conditionally quasi-abelian of corresponding formal type if each W_i=1 is satisfied for every
assignment of elements in G as values of its letters for which each W_ij=1 is satisfied. Correspondingly,
the conditional quasi-commutator subgroup of G is to be the smallest subgroup of G having the
property that each W_i is in that subgroup for every assignment of elements in G as values of its
letters for which each W_ij is in that subgroup. Our development up to, and including, the
equivalence theorem then goes over. But now symbolic logic suggests that our conditions might
involve more explicitly its apparent variables and other apparatus, and our horizon keeps
receding. Thus, also, Neumann suggests the possibility of allowing constant elements of a group
to enter into his identical relations, while Hall, in his higher commutator forms, from the start
allows arbitrary subgroups of G individually to replace G as domain of a corresponding variable.
It may be that a postulational procedure, perhaps centering around our actual development, or
around the point of view about to be suggested, would bring order out of the chaos that thus
threatens.

31. The φ-subgroup of an m-adic group. The concept of a set of elements
of a group being a set of independent generators of the group is equally
applicable to a polyadic group. Whereas an ordinary group always has at least
one element, namely the identity, which can never be one of a set of
independent generators of the group(88), this need not be so in the case of a
polyadic group. Thus a cyclic m-group of order g such that each prime divisor of g
divides m−1 can be generated by any one of its elements, and hence fails to
possess an element of the type in question. If, however, an m-group G has
at least one element which cannot be one of a set of independent generators of
the group, then the set of all such elements constitutes a characteristic
subgroup of G which may be called the φ-subgroup of G. It is a mark of the
generality of the concept of the φ-subgroup that the self-same proofs which yield
the corresponding results for ordinary groups apply verbatim to polyadic
groups to give the following. The φ-subgroup of an m-group G is the crosscut
of all the maximal subgroups of G. If the φ-subgroup of an m-group G involves a
non-invariant element or subgroup, the number of conjugates under G of this
element or subgroup is greater than the number of the corresponding conjugates
under the φ-subgroup. As an application of the first of these results we may
note that if a cyclic m-group G is of order
$g = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k} \gamma_0$, the p's being the
distinct prime divisors of g not divisors of m−1, then the φ-subgroup of G
exists if there be at least one such prime p, and is then the subgroup of order
$p_1^{\alpha_1-1} p_2^{\alpha_2-1} \cdots p_k^{\alpha_k-1} \gamma_0$. Hence also, if we continue forming φ-subgroups
starting with the cyclic m-group G, we finally arrive at the subgroup of order $\gamma_0$
which has no φ-subgroup. Since the φ-subgroup is always a “proper” subgroup,
if we start with any finite m-group and successively form φ-subgroups,
we arrive at a subgroup whose φ-subgroup is nonexistent. This weak
statement, supported by the above example for m>2, contrasts with the case
m = 2 when the last existent φ-subgroup is always the identity.

(88) Unless the group is the identity.
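
The statement about cyclic m-groups can be checked by brute force on small cases. The sketch below rests on a modelling assumption that is not spelled out in this section: the cyclic m-group of order g generated by s is taken to consist of the powers s^x with x ≡ 1 (mod m−1), the exponents being read modulo g(m−1), and the m-ary product adds exponents. All function names are hypothetical. The φ-subgroup is computed as the crosscut of the maximal subgroups, relying on the theorem quoted above, and its order is compared with the order predicted from the prime factorization of g.

```python
def cyclic_m_group(g, m):
    """Return (N, elements): exponents x = 1 + (m-1)j taken mod N = g(m-1)."""
    N = g * (m - 1)
    return N, [(1 + (m - 1) * j) % N for j in range(g)]


def generate(gens, m, N):
    """Close a set of exponents under the m-ary product (sum of m exponents mod N)."""
    elems = set(gens)
    while True:
        tails = {0}
        for _ in range(m - 1):            # all sums of m-1 current elements
            tails = {(a + b) % N for a in tails for b in elems}
        new = {(a + t) % N for a in elems for t in tails}
        if new <= elems:
            return frozenset(elems)
        elems |= new


def all_subgroups(g, m):
    """Every subgroup is a join of one-generator subgroups, so close under joins."""
    N, elements = cyclic_m_group(g, m)
    subs = {generate([x], m, N) for x in elements}
    grew = True
    while grew:
        grew = False
        for a in list(subs):
            for b in list(subs):
                c = generate(a | b, m, N)
                if c not in subs:
                    subs.add(c)
                    grew = True
    return subs


def phi_subgroup(g, m):
    """Crosscut of the maximal subgroups, or None when it does not exist."""
    _, elements = cyclic_m_group(g, m)
    whole = frozenset(elements)
    proper = [h for h in all_subgroups(g, m) if h != whole]
    maximal = [h for h in proper if not any(h < k for k in proper)]
    if not maximal:
        return None
    common = frozenset.intersection(*maximal)
    return common if common else None


def predicted_order(g, m):
    """Order read off from the factorization of g, or None if no prime of g avoids m-1."""
    order, found, n, p = 1, False, g, 2
    while n > 1:
        a = 0
        while n % p == 0:
            n //= p
            a += 1
        if a and (m - 1) % p == 0:
            order *= p ** a               # primes dividing m-1 keep their full power
        elif a:
            order *= p ** (a - 1)         # other primes lose one factor
            found = True
        p += 1
    return order if found else None


for g, m in [(12, 3), (4, 3), (12, 2)]:
    phi = phi_subgroup(g, m)
    print(g, m, None if phi is None else len(phi), predicted_order(g, m))
```

For g = 12 and m = 3 the only prime of g not dividing m−1 is 3, and the model gives a φ-subgroup of order 4; the triadic cyclic group of order 4 in turn has no φ-subgroup, in line with the remark on iterating the construction.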

In applying the second of the above two general results to the Sylow
subgroups of the φ-subgroup of an arbitrary m-group G, we are hampered by the
order condition of our extension of Sylow's theorem. Within the scope of that
condition, we note first that if the φ-subgroup of G is of order g', and if, with
$p^{\alpha'}$ the largest power of the prime p dividing g', $g'/p^{\alpha'}$ is prime to m−1, then
the φ-subgroup has a Sylow subgroup of order $p^{\alpha'}$ which then, as in the
ordinary case, is unique. If then g' itself is prime to m−1, the φ-subgroup will
have one and only one Sylow subgroup for each distinct prime divisor of g'.
Since, with g' prime to m−1, the first order elements of the φ-subgroup
constitute a complete set of conjugates under the φ-subgroup, it follows as for the
Sylow subgroups that the φ-subgroup then has one and only one first order
element. That is, when the order of the φ-subgroup of an m-group is prime to
m−1, the φ-subgroup is reducible to a 2-group. When so reduced its Sylow
subgroups are reduced to the Sylow subgroups of the 2-group. As in the
ordinary case, the φ-subgroup is then the direct product of its Sylow subgroups.

This result has an interesting consequence when the order of the given
m-group is itself prime to m−1. The φ-subgroup, if it exists, then has but one
first order element. The invariance of the φ-subgroup therefore entails the
invariance of this first order element under the given m-group. But this can
only be the case if the m-group has no other first order element. Hence, if an
m-group of order prime to m−1 has more than one first order element, its
φ-subgroup is nonexistent; that is, if an m-group of order prime to m−1 is not
reducible to a 2-group, each of its elements can be one of a set of independent
generators of the group. On the other hand, if the m-group is reducible to a
2-group, its sole first order element can be generated by any other element,
and hence is in the consequently existent φ-subgroup of the group.

We restrict our discussion of the φ-subgroups of primitive groups to
primitive m-adic groups of ordinary substitutions. By the corresponding theorem
of §18, the subgroups consisting of all substitutions omitting a given letter
are maximal subgroups. Since these maximal subgroups can only have the
identity in common, it follows that the φ-subgroup of a primitive m-adic group
of ordinary substitutions is either the identity, or else is nonexistent. Certainly
then when the primitive group in question does not possess the identity, and
hence a fortiori when it is not reducible to a 2-group, its φ-subgroup is
nonexistent. Strangely enough, the same may be true even when the identity is
in the primitive group, then consequently reducible to a 2-group. Thus, the
ordinary cyclic substitution group of order and degree a prime p remains
primitive when extended to a (p+1)-group. Yet, while the identity and any
other element together generate the (p+1)-group, each alone generates only
itself.
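
The closing example is easy to test numerically. In the sketch below the cyclic group of prime order p is modelled, as an assumption made only for illustration, by the residues modulo p under addition, and its extension to a (p+1)-group by the sum of p+1 residues; the function name is hypothetical.

```python
def closure(gens, p):
    """Closure of a set of residues under the (p+1)-ary sum modulo p."""
    m = p + 1
    elems = set(gens)
    while True:
        sums = {0}
        for _ in range(m):                # all sums of m current elements
            sums = {(a + b) % p for a in sums for b in elems}
        if sums <= elems:
            return elems
        elems |= sums


p = 5
print(closure({2}, p))      # a single element generates only itself
print(closure({0, 2}, p))   # the identity and another element generate everything
```

Since every element, the identity included, is thus one of a pair of independent generators, no element qualifies for the φ-subgroup.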


32. Simply isomorphic m-adic groups; group of inner isomorphisms. We 
have defined simply isomorphic m-groups in §4, and have shown there that 
the transform of an m-group by an element or polyad is an m-group simply 
isomorphic with the given m-group. Restricting our attention to the case 
when the simple isomorphism is an automorphism, i.e., between an m-group 
and itself, we then have conversely, as in the case of ordinary groups, that any 
automorphism of an m-group can be effected by transforming it by an ele- 
ment. This really means that an m-group can be found of which the given 
m-group is a subgroup and which has an element so transforming the given 
m-group. This result may be proved as in the ordinary case by representing the 
given m-group as a regular m-adic substitution group in accordance with §28. 
Then, by §16, the principal holomorph of the m-group so represented certainly 
transforms it into each of its possible automorphisms. 

Since the abstract containing group of an m-group is determined ab- 
stractly by the m-group, we see that a simple isomorphism between two 
m-groups determines a simple isomorphism between their abstract containing 
groups. Conversely, any simple isomorphism between the abstract contain- 
ing groups of two m-groups which makes the classes of elements of the 
m-groups correspond determines a simple isomorphism between the m-groups. 
The simple isomorphism theorem of §8 may be considered a refinement of 
this obvious result. As that theorem is related to the determination theorem 
preceding it, so the following theorems are related to two of the generation 
theorems of §25. Their proofs, easily supplied, are therefore here omitted. 

Two m-groups of the same order G' and G'' are simply isomorphic if their
associated ordinary groups $G_0'$ and $G_0''$ contain two simply isomorphic subgroups
$H_0'$ and $H_0''$ invariant under G' and G'' respectively, while G' and G'' are
generated by $H_0'$ and $H_0''$ and two elements $s_1$ and $s_2$ such that if $s_1^{(m-1)\sigma}$ is the smallest
positive power of $s_1^{m-1}$ that occurs in $H_0'$, then $s_2^{(m-1)\sigma}$ is the smallest positive power
of $s_2^{m-1}$ that occurs in $H_0''$, and $s_1^{(m-1)\sigma}$, $s_2^{(m-1)\sigma}$ correspond in the given simple
isomorphism of $H_0'$ and $H_0''$. Moreover, it is assumed that $s_1$ and $s_2$ transform
corresponding generators of $H_0'$, $H_0''$ into corresponding elements in the given
simple isomorphism.

Two m-groups of the same order $G_1$ and $G_2$ are simply isomorphic if they
contain two simply isomorphic invariant subgroups $H_1$ and $H_2$ respectively, and are
generated by these subgroups and two elements $s_1$ and $s_2$ such that if $s_1^{\sigma}$ is the
smallest positive power of $s_1$ which occurs in the abstract containing group $H_1^*$
of $H_1$, then $s_2^{\sigma}$ is the smallest positive power of $s_2$ which occurs in the abstract
containing group $H_2^*$ of $H_2$, and $s_1^{\sigma}$ and $s_2^{\sigma}$ correspond as a consequence of the given
simple isomorphism of $H_1$ and $H_2$. Moreover, it is assumed that $s_1$, $s_2$ transform
corresponding generators of $H_1$, $H_2$ into corresponding elements in the given
simple isomorphism.

We have observed that cyclic m-groups of the same order are simply
isomorphic, and, obviously, no noncyclic m-group can be simply isomorphic
with a cyclic m-group. The following is a rather interesting application of the
simple isomorphism theorem of §8. Let G' and G'' be two m-groups of order g
reducible to cyclic polyadic groups, and let element $s_1'$ of G' be of the same
m-adic order as element $s_1''$ of G''. Then element $(s_1')^{m-1}$ of $G_0'$ is of the same
ordinary order as element $(s_1'')^{m-1}$ of $G_0''$. Since $G_0'$ and $G_0''$ are ordinary cyclic
groups of order g, a simple isomorphism can be set up between them which
makes $(s_1')^{m-1}$ correspond to $(s_1'')^{m-1}$. The theorem in question then yields the
following result. If two m-groups reducible to cyclic polyadic groups are of the
same order, and one m-group has an element of the same order as an element of
the other, then the m-groups are simply isomorphic.

Every automorphism of an m-group G permutes the elements of G accord- 
ing to a certain ordinary substitution. These substitutions clearly constitute 
an ordinary substitution group which may be called the group of
isomorphisms of G. This terminology may be reconciled with that of §16 by
noting that when G is represented as a regular substitution group, the corre- 
sponding (Ko) of §16 is simply isomorphic with the group of isomorphisms 
of G. 

On the other hand the substitutions which result merely from transform- 
ing G by its own elements need not form a 2-group. In fact, it is readily veri- 
fied that they do form an ordinary substitution group when and only when G 
has an invariant element. However they clearly do form an m-adic group of 
ordinary substitutions which may then be called the group of inner iso- 
morphisms of G. It is easily proved that as in the ordinary theory this m-group 
is simply isomorphic with the central quotient group of G. Hence it is simply 
isomorphic with G if and only if the associated central of G is the identity. 

By using the fact that every automorphism of G can be obtained by trans- 
forming it by some element, it is readily proved that the group of inner iso- 
morphisms of G is an invariant subgroup of the group of isomorphisms of G, 
if not identical with it, when the latter is extended to an m-group. On the 
other hand, the containing group of the group of inner isomorphisms is di- 
rectly an invariant subgroup of the group of isomorphisms, when not identi- 
cal with it. This containing group clearly consists of the substitutions accord- 
ing to which the elements of G are permuted when G is transformed by all 
of its polyads. 

In extending the Sylow subgroup property of the group of inner
isomorphisms of an ordinary group to m-groups, we have to restrict our m-adic
G to be of order g with $g/p^{\alpha}$ prime to m−1, $p^{\alpha}$ being the largest power of the
prime p dividing g. Since the order of $I_{11}$, the m-group of inner isomorphisms
of G, divides g, $I_{11}$ has the same order property. We can then show that $I_{11}$
contains the same number of Sylow subgroups corresponding to p as G does, it
being understood that if p does not divide the order of $I_{11}$ the corresponding
Sylow subgroups of $I_{11}$ are its subgroups of first order. While the proof differs
little from the corresponding ordinary group proof, we cannot follow Miller
in dismissing it with a line, and instead present it at least in outline. The
elements of $I_{11}$ corresponding to the elements of a subgroup H of G constitute
a subgroup H' of $I_{11}$ which may be called H's corresponding subgroup. Let H
be a Sylow subgroup of G for the prime p in accordance with our hypothesis.
Then, by considering $I_{11}$ to be the central quotient group of G, and comparing
the largest powers of p dividing the orders of G, $I_{11}$, and Co with those
dividing the orders of H, H', and the crosscut of Ho and Co, we are enabled to
conclude that H' is a Sylow subgroup of $I_{11}$ for the prime p. Since
corresponding elements of G and $I_{11}$ transform corresponding subgroups into
corresponding subgroups, the relation between the Sylow subgroups of G for the prime p
and their corresponding subgroups of $I_{11}$ is shown by the complete set of
conjugates theorem to be a correspondence between all the Sylow subgroups
of G, and all the Sylow subgroups of $I_{11}$, for the prime p. Finally, since any
subgroup of G with given corresponding subgroup of $I_{11}$ would be transformed
into itself by any other subgroup of G with that corresponding subgroup of $I_{11}$,
the above correspondence must be 1-1.

The fact that the central quotient group of a non-abelian group cannot be
cyclic leads in ordinary group theory to the result that the order of the group
of inner isomorphisms of a non-abelian group is at least four. In the case of a
non-abelian m-group, the same theorem, used in conjunction with our
determination of the m-groups of the first three orders, shows that the
order of the group of inner isomorphisms of m-groups is at least two when
m−1 is even, three when m−1 is odd but divisible by 3, four when m−1 is
neither divisible by 2 nor 3. The following examples show that these actually
are the least orders of $I_{11}$ for such m's as well as the fact that the order of $I_{11}$
may have any value from that least order up to and including the order four.
First, by extending an ordinary group with $I_{11}$ of order four to an m-group,
we see that for any m, $I_{11}$ may be of order four. An $I_{11}$ of order three is
immediately furnished for m−1 even by the non-abelian m-group of order
three itself. For m−1 odd, but divisible by 3, we have the following
example with m−1=3, and hence by extension for any m with m−1 divisible
by 3. Let Go be the ordinary cyclic group of order nine generated by the cyclic
substitution $t = (a_1 a_2 a_3 a_4 a_5 a_6 a_7 a_8 a_9)$. Then $s = (a_2 a_5 a_8)(a_3 a_9 a_6)$ transforms t into
$t^4$ while $s^3 = 1$. $G = G_0 s$ is then a 4-group of order nine. Since Go is abelian, the
associated central Co of G consists of the elements of Go invariant under s,
i.e., of 1, $t^3$, $t^6$. The $I_{11}$ of G is therefore also of order three. Finally an $I_{11}$ of
order two for m−1=2, and hence by extension for any even m−1, is
exhibited by the following 3-group of order four. Let Go be the axial group 1,
(ab), (cd), (ab)(cd), s the substitution (ac)(bd). Since s transforms Go into
itself, while $s^2 = 1$, $G = G_0 s$ is a 3-group of order four. As s transforms but 1 and
(ab)(cd) of Go into themselves, the Co of G, and hence also the $I_{11}$ of G, is of
order two.
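
Both examples can be verified mechanically. The sketch below makes several assumptions for illustration only: the nine-letter example is taken with t = (a1 a2 ... a9) and s = (a2 a5 a8)(a3 a9 a6), products are read left to right, the transform of x by u is taken to be u^-1 x u, and the m-group is the coset Go s. The function names are hypothetical. The order of the group of inner isomorphisms is counted directly as the number of distinct permutations of the elements of G induced by transforming G by its own elements.

```python
def from_cycles(n, *cycles):
    """Permutation of letters 1..n (stored 0-based) from disjoint cycles."""
    img = list(range(n))
    for cyc in cycles:
        for i, x in enumerate(cyc):
            img[x - 1] = cyc[(i + 1) % len(cyc)] - 1
    return tuple(img)


def mul(p, q):
    """Left-to-right product: apply p first, then q."""
    return tuple(q[i] for i in p)


def inv(p):
    r = [0] * len(p)
    for i, j in enumerate(p):
        r[j] = i
    return tuple(r)


def power(p, k):
    r = tuple(range(len(p)))
    for _ in range(k):
        r = mul(r, p)
    return r


def transform(x, u):
    """Transform of x by u, taken here as u^-1 x u."""
    return mul(mul(inv(u), x), u)


def associated_central(G0, s):
    """Elements of G0 left fixed when transformed by every element of G = G0 s."""
    G = [mul(c, s) for c in G0]
    return {c for c in G0 if all(transform(c, x) == c for x in G)}


def inner_order(G0, s):
    """Number of distinct permutations of G induced by transforming G by its elements."""
    G = [mul(c, s) for c in G0]
    return len({tuple(transform(x, u) for x in G) for u in G})


# 4-group of order nine: G0 cyclic of order nine, s of order three.
t = from_cycles(9, (1, 2, 3, 4, 5, 6, 7, 8, 9))
s9 = from_cycles(9, (2, 5, 8), (3, 9, 6))
G0_nine = [power(t, k) for k in range(9)]

# 3-group of order four: G0 the axial group on a, b, c, d (letters 1..4).
e4 = from_cycles(4)
G0_axial = [e4, from_cycles(4, (1, 2)), from_cycles(4, (3, 4)),
            from_cycles(4, (1, 2), (3, 4))]
s4 = from_cycles(4, (1, 3), (2, 4))

print(transform(t, s9) == power(t, 4), inner_order(G0_nine, s9))
print(len(associated_central(G0_axial, s4)), inner_order(G0_axial, s4))
```

Under the opposite convention for the transform the same s would carry t into its seventh power instead of its fourth; the associated central and the order of the group of inner isomorphisms are unaffected by that choice.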

When $I_{11}$ is of order two it can abstractly be but the noncyclic m-group of
order two with its two first order elements. G is correspondingly separated
into two abelian subgroups of half its order. It is readily proved that every
abelian subgroup of G is contained in one of these subgroups. Conversely, if
non-abelian G can be separated into two abelian subgroups, its $I_{11}$ is of order
two.

When $I_{11}$ is of order three, it can be but the non-abelian group when m−1
is of the form 6u+2 and 6u+4, the abelian noncyclic group when m−1 is of
the form 6u+3, and either of these two when m−1 is of the form 6u+6 as
shown by extensions of the cases where m−1=2 and 3. In any event $I_{11}$
consists of three first order elements, so that G is separated into three abelian
subgroups of one-third its order. Again every abelian subgroup of G is
contained in one of these three subgroups. We have not however been able to
decide the question whether a non-abelian G which can be separated into
three abelian subgroups of one-third its order must have $I_{11}$ of order three.

We restrict our discussion of $I_{11}$ of order four to m's for which four is the
least order of $I_{11}$, i.e., to m−1 not divisible by 2 or 3. Since m−1 is then
prime to the order of $I_{11}$, while the smallest prime divisor of m−1 cannot be
less than 5, our seemingly trivial form for the number of first order elements
of an m-group with m−1 prime to g shows that $I_{11}$ has exactly one first order
element. $I_{11}$ is therefore reducible to an ordinary group of order four, and
indeed to the axial group. Furthermore, the subgroups of $I_{11}$ reduce to the
subgroups of the axial group when $I_{11}$ is so reduced. It follows that G then has
three abelian subgroups of half its order, while every abelian subgroup of G
is contained in one of these subgroups. Conversely, if a non-abelian m-group
with m−1 not divisible by 2 or 3 has more than one abelian subgroup of
half its order, its $I_{11}$ is reducible to the axial group.

33. Extension of Frobenius's theorem to m-adic groups. Thanks to recent
work of Hall(89) on a wide generalization of Frobenius's theorem, the
extension of the original theorem of Frobenius to polyadic groups is immediate.
A very special case of Theorem III of Hall's paper may be stated as follows.
If a subgroup H is transformed into itself by an element P, then the
number of solutions of $X^N = 1$ which lie in the coset HP is congruent to 0 modulo
H.C.F.(N, h), where h is the order of H. Given, then, an arbitrary m-group G
of order g, express G in the form $G = G_0 s_0$ in accordance with our coset
theorem. With n a divisor of g, the elements s of G whose m-adic orders divide n
are those for which $s^{[n]} = s$, i.e., $s^{n(m-1)} = 1$. Since Go is transformed into itself
under $s_0$, the above special case of Hall's theorem is immediately applicable
with N = n(m−1) and h = g, for which H.C.F.(N, h) = n H.C.F.(g/n, m−1),
to yield the following result. The number of elements of an m-group G of order g
whose (m-adic) orders divide an arbitrary divisor n of g is, if not 0, not only a
multiple of n, but of n H.C.F.(g/n, m−1).

(89) P. Hall, On a theorem of Frobenius, Proceedings of the London Mathematical Society,
(2), vol. 40 (1935-1936), pp. 468-501.

That the number in question may be 0 is shown by a cyclic m-group of
order g with g/n not prime to m−1. If y is any divisor of n, g/y will also fail
to be prime to m−1, and the cyclic group has no elements of orders
dividing n. Note that when g is prime to m−1 this can never occur, for our
otherwise arbitrary G must then have at least one first order element. Actually,
by applying the above result, restated for n not a divisor of g, to the
conjugate subgroups of G of §26 (and for these subgroups, indeed, the result is
easily obtainable with but the help of the ordinary Frobenius theorem) we
obtain the following stronger result. If an m-group G is of order g prime to
m−1, and n is any divisor of g, then the number of elements of G whose orders
divide n is a multiple not only of n, but of n H.C.F.(g/n, λ), λ being the number
of first order elements of G.
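
For cyclic m-groups the divisibility statement of this section, and the possibility of the count being zero, can be checked by direct enumeration. The sketch below again assumes, purely as a model, that the cyclic m-group of order g consists of the exponents x ≡ 1 (mod m−1) read modulo g(m−1), so that the m-adic order of an element divides n exactly when n(m−1)x ≡ 0 modulo g(m−1). The function names are hypothetical.

```python
from math import gcd


def count_orders_dividing(g, m, n):
    """Number of elements of the cyclic m-group of order g whose m-adic order divides n."""
    N = g * (m - 1)
    elements = [(1 + (m - 1) * j) % N for j in range(g)]
    return sum(1 for x in elements if (x * n * (m - 1)) % N == 0)


def frobenius_table(g, m):
    """Rows (n, count, n * H.C.F.(g/n, m-1)) for every divisor n of g."""
    rows = []
    for n in range(1, g + 1):
        if g % n == 0:
            rows.append((n, count_orders_dividing(g, m, n),
                         n * gcd(g // n, m - 1)))
    return rows


for row in frobenius_table(12, 3):
    print(row)
```

In this model the count for a divisor n is n when g/n is prime to m−1 and 0 otherwise, so that each nonzero count is indeed a multiple of n H.C.F.(g/n, m−1).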

34. Representation of an abstract m-adic group as a transitive (m, μ)
substitution group. We shall consider the general question of representing an
abstract m-group G of order g by a transitive m-adic group of μ-adic
substitutions of degree n. (See §17.) The result can then immediately be specialized
to the two cases of chief interest, μ=m and μ=2, as well as to the case n=g,
i.e., when the representing group is regular.

In the general case it is necessary to introduce polyadic groups
intermediate between G and its associated ordinary group Go, groups whose
introduction simultaneously with that of Go could have been used to generalize the
theory at a number of points(90). Clearly each coset in the expansion of the
abstract containing group G* of G as regards Go is a polyadic group of order g
under suitable extensions of the dyadic operation of G*. In particular, if i is a
divisor of m−1, the coset consisting of the i-ads of G, regarded as members
of G*, will thus constitute a group of dimension (m−1)/i+1. It will suffice
to refer to this group as the polyadic group $G_i$ of the i-ads of G. In particular
$G_1 = G$, $G_{m-1} = G_0$. As in the case of the subgroups of G, we may identify $(G_i)^*$
with the subgroup of G* generated by the elements of $G_i$. $(G_i)_0$ is then simply
Go. Finally, since the isomorphism between G* and any other containing
group of G established in §6 involves but a 1-1 correspondence between the
elements of two corresponding cosets, $G_i$ may similarly be set up by means of
any containing group of G.

(90) E.g., see the end of the last footnote to §7. Likewise the concept of semi-invariant
subgroups could correspondingly be generalized.

Suppose then that G can be represented by a transitive (m, μ) group G'
of degree n, with, of course, μ−1 a divisor of m−1. Corresponding to the
polyadic group $G_{\mu-1}$ of the (μ−1)-ads of G there will then be the polyadic
group $G'_{\mu-1}$ of the (μ−1)-ads of G', conveniently set up by means of the
containing group of G' generated by the substitutions of G'. $G'_{\mu-1}$ then consists of
substitutions carrying each of the μ−1 Γ's on which G' is written into
themselves(91). Since G' is transitive, at least one substitution of $G'_{\mu-1}$ carries $a_{11}$
into itself. The set of all such substitutions in $G'_{\mu-1}$ then constitutes a subgroup
$H'_{\mu-1}$ of $G'_{\mu-1}$ of order g/n. The associated ordinary group $H'_0$ of $H'_{\mu-1}$ is a
subgroup of the associated ordinary group $G'_0$ of G', and, in fact, consists of the
substitutions of $G'_0$ carrying $a_{11}$ into itself. It then follows from the
transitivity of G' that neither $H'_0$, if it be not the identity, nor any subgroup of $H'_0$
other than the identity is invariant under G'.

(91) Note that these will also be the substitutions forming $G'_0$ when and only when the
containing group generated by G' is of index μ−1.

It therefore follows that for G to be representable by a transitive (m, μ)
group of degree n, μ−1 a divisor of m−1, it is necessary that $G_{\mu-1}$ have a
subgroup $H_{\mu-1}$ of order g/n such that neither Ho, that is, $(H_{\mu-1})_0$, if it be not the
identity, nor any subgroup of Ho other than the identity is invariant under G.
We now prove this condition also sufficient. Each right coset of G* as regards
Ho consists of g/n i-ads with fixed i. $H_{\mu-1}^*$ consists of (m−1)/(μ−1) of these
cosets, one for each i a multiple of μ−1. Each right coset of G* as regards
$H_{\mu-1}^*$ therefore also consists of (m−1)/(μ−1) of the right cosets of G* as
regards Ho, one for each i differing from a fixed $i = i_0$ by a multiple of μ−1. We
may then choose $i_0$ so that $1 \le i_0 \le \mu-1$. And for each such $i_0$ there will be
exactly n right cosets of G* as regards $H_{\mu-1}^*$ which together exhaust all i-ads with
$i - i_0$ a multiple of μ−1. Now symbolize the n right cosets of G* as regards
$H_{\mu-1}^*$ with $i_0 = 1$ by the letters $a_{11}, a_{12}, \ldots, a_{1n}$. These together will form the
$\Gamma_1$ of the basis of our representation. Similarly for $\Gamma_2, \ldots, \Gamma_{\mu-1}$, with $i_0$
correspondingly $2, \ldots, \mu-1$. If now we multiply the elements of G* on the right
by an element s of G, the effect on the right cosets of G* as regards $H_{\mu-1}^*$ is
merely to permute them as units, the $i_0$ of such a coset becoming $i_0+1$,
reduced modulo μ−1 if need be. In terms of the a's therefore, the letters of $\Gamma_1$
go over in 1-1 fashion into those of $\Gamma_2$, of $\Gamma_2$ into those of $\Gamma_3$, ..., of $\Gamma_{\mu-1}$
into those of $\Gamma_1$. Corresponding to s there is thus determined a μ-adic
substitution s' of degree n on the letters of $\Gamma_1, \Gamma_2, \ldots, \Gamma_{\mu-1}$. The set of all such μ-adic
substitutions corresponding to elements of G clearly constitute an m-group
G', under the product of m substitutions as operation, isomorphic with G.
This isomorphism is also simple. For if $s_1$ and $s_2$ are any two elements of G
corresponding to the same substitution s' of G', $t = s_1 s_2^{-1}$ must be both in Go
and $H_{\mu-1}^*$, and hence in Ho. The set of such t's must then be a group contained
in Ho, and invariant under G, and hence consists of the identity only. That
is, $s_1 = s_2$.

We have thus proved the following theorem. A necessary and sufficient condition
that an abstract $m$-group $G$ of order $g$ can be represented as a transitive
$m$-adic group of $\mu$-adic substitutions of degree $n$, $\mu-1$ a divisor of $m-1$, is that
the polyadic group of $(\mu-1)$-ads of $G$ contains a subgroup of order $g/n$ whose
associated ordinary group, if not the identity, is not invariant under $G$, and contains
no subgroup besides the identity invariant under $G$. For the representation
of $G$ by a transitive $m$-adic substitution group of degree $n$ this condition
reduces to the condition that the associated ordinary group of $G$ contains a
subgroup of order $g/n$ which, if not the identity, is not invariant under $G$, and


330 E. L. POST [September


contains no subgroup besides the identity invariant under G, while for the rep- 
resentation of G by a transitive m-group of ordinary substitutions the con- 
dition becomes G contains a subgroup of order g/n whose associated ordinary 
group has the above property. 

When $g=n$ the non-invariantive property is vacuously satisfied. Hence a
necessary and sufficient condition that an abstract $m$-group $G$ can be represented
by a regular $m$-adic group of $\mu$-adic substitutions is that the polyadic group of
$(\mu-1)$-ads of $G$ possesses a first order element. When $\mu=m$ this leads again,
through the identity of $G_0$, to the universal representability of abstract $m$-groups
as regular $m$-adic substitution groups. On the other hand, for the representation
of $G$ as a regular $m$-adic group of ordinary substitutions, it is
necessary and sufficient that $G$ possess a first order element. In particular,
every abstract $m$-group of order prime to $m-1$ can be so represented.


C. FINITE m-ADIC LINEAR GROUPS 


35. m-adic linear transformations. An ordinary transformation in $n$ variables
may be thought of as transforming an $n$-dimensional space $\Sigma$ into itself.
By analogy with m-adic substitutions, an m-adic transformation in $n$ variables
will then transform $m-1$ spaces $\Sigma', \Sigma'', \ldots, \Sigma^{(m-1)}$, of $n$ dimensions
each, cyclically into each other, i.e., $\Sigma' \to \Sigma'' \to \cdots \to \Sigma^{(m-1)} \to \Sigma'$.
In particular, if $x_{i1}, x_{i2}, \ldots, x_{in}$ are the old coordinates in $\Sigma^{(i)}$, and
$x'_{i1}, x'_{i2}, \ldots, x'_{in}$ the new, an m-adic linear transformation of
$\Sigma', \Sigma'', \ldots, \Sigma^{(m-1)}$
will consist of $m-1$ sets of linear homogeneous equations of the form

\[ x_{ij} = a^{(i)}_{j1} x'_{(i+1)1} + a^{(i)}_{j2} x'_{(i+1)2} + \cdots + a^{(i)}_{jn} x'_{(i+1)n}, \qquad j = 1, 2, \ldots, n, \]

where $i=1, 2, \ldots, m-1$, the $i+1$ in the last case being replaced by 1, and
where for each $i$ the determinant of the $n^2$ coefficients is not zero.

As in the case of m-adic substitutions, we shall assume for simplicity that
the spaces $\Sigma', \Sigma'', \ldots, \Sigma^{(m-1)}$ are mutually exclusive. The m-adic linear
transformation $A$ may then be considered to be an ordinary linear transformation
of the $(m-1)n$ variables $x_{11}, \ldots, x_{(m-1)n}$, but of the above special
form(*). The product of $m$ such linear transformations will again be a linear
transformation of the same form, and hence serves to define an m-adic
operation on m-adic linear transformations of $\Sigma', \Sigma'', \ldots, \Sigma^{(m-1)}$. It then
readily follows that the class of all m-adic linear transformations of

(*) The above requirement that each of the $m-1$ separate determinants be different from
zero is equivalent to this ordinary linear transformation's being nonsingular. See the end of the
present section.




1940] POLYADIC GROUPS 331 


$\Sigma', \Sigma'', \ldots, \Sigma^{(m-1)}$ with complex coefficients form an $m$-group under this
operation. For the associative law follows immediately from this reinterpretation.
Furthermore, if in the equation $A_1A_2 \cdots A_m=A_{m+1}$ all but $A_i$ are specified
m-adic linear transformations, $A_i$ will be determined as an ordinary linear
transformation and be given by the equation
$A_i=A_{i-1}^{-1} \cdots A_1^{-1}A_{m+1}A_m^{-1} \cdots A_{i+1}^{-1}$.
Now each inverse $A_r^{-1}$ carries $\Sigma^{(j)}$ into $\Sigma^{(j-1)}$, while $A_{m+1}$ carries $\Sigma^{(j)}$ into
$\Sigma^{(j+1)}$. Hence $A_i$ carries $\Sigma^{(j)}$ into $\Sigma^{(k)}$,
where $k \equiv j-(m-1)+1 \pmod{m-1}$, i.e., $k \equiv j+1 \pmod{m-1}$, and $A_i$ is also
an m-adic linear transformation.

We shall call any set of m-adic linear transformations of $\Sigma', \Sigma'', \ldots, \Sigma^{(m-1)}$
which constitute an $m$-group under the above operation an m-adic linear
group in $n$ variables. Any such $m$-group will then be a subgroup of the above
"complete" m-adic linear group in $n$ variables. It follows that the necessary
and sufficient condition that a finite set of m-adic linear transformations of
$\Sigma', \Sigma'', \ldots, \Sigma^{(m-1)}$ with complex coefficients form an m-adic linear group
is that the product of any $m$ members of the set is in the set. Unless otherwise
indicated, m-adic linear group will mean finite m-adic linear group in the present
paper. However, the infinite complete m-adic linear group is useful in
serving as fundamental $m$-group for operations on arbitrary m-adic linear
transformations. Its members, as ordinary linear transformations in $(m-1)n$
variables, will generate a containing group of index $m-1$ which may therefore
be used in place of its abstract containing group. Its ordinary associated
group, consisting of the products of $m-1$ m-adic linear transformations
of $\Sigma', \Sigma'', \ldots, \Sigma^{(m-1)}$, will therefore consist of transformations which carry
each $\Sigma^{(i)}$ into itself, and indeed of all linear transformations with complex
coefficients which carry each $\Sigma^{(i)}$ into itself. We may therefore refer to such
transformations as $(m-1)$-ads of m-adic linear transformations, or briefly
$(m-1)$-ads.

While it will continue to be useful every so often to consider m-adic linear
transformations as special forms of ordinary linear transformations, it is as
generalization of ordinary linear transformation that they lend themselves
to a corresponding generalization of the ordinary theory. For this purpose we
return to our arbitrary m-adic linear transformation $A$, and as in ordinary
theory represent it by the m-adic matrix

\[ A = [A', A'', \ldots, A^{(m-1)}], \]

where the component $A^{(i)}$ is the ordinary matrix

\[ \begin{pmatrix}
a^{(i)}_{11} & a^{(i)}_{12} & \cdots & a^{(i)}_{1n} \\
a^{(i)}_{21} & a^{(i)}_{22} & \cdots & a^{(i)}_{2n} \\
\vdots & \vdots & & \vdots \\
a^{(i)}_{n1} & a^{(i)}_{n2} & \cdots & a^{(i)}_{nn}
\end{pmatrix} \]




formed from the coefficients in the equations expressing each $x_{ij}$, with fixed $i$,
in terms of the $x'_{(i+1)k}$'s. The product of $m$ m-adic linear transformations
$A_1, A_2, \ldots, A_m$ is the m-adic linear transformation $A$ obtained by performing
these $m$ transformations in succession. If then the corresponding m-adic matrices
are $A_1=[A_1', A_1'', \ldots, A_1^{(m-1)}], \ldots, A_m=[A_m', A_m'', \ldots, A_m^{(m-1)}]$,
then by following through the $m$ transformations, we find that the m-adic
matrix $A=[A', A'', \ldots, A^{(m-1)}]$ is given by the following ordinary matrix
equations

\[ A' = A_1' A_2'' \cdots A_{m-1}^{(m-1)} A_m', \]
\[ \cdots \]
\[ A^{(i)} = A_1^{(i)} A_2^{(i+1)} \cdots A_m^{(i)}, \]
\[ \cdots \]
\[ A^{(m-1)} = A_1^{(m-1)} A_2' \cdots A_{m-1}^{(m-2)} A_m^{(m-1)}. \]

These equations then completely determine an m-adic operation on m-adic
matrices. We shall call $A$ the product of $A_1, A_2, \ldots, A_m$, and write simply
$A=A_1A_2 \cdots A_m$. From the correspondence between m-adic matrices and
m-adic linear transformations we then have immediately that the set of
all m-adic matrices with complex $a$'s and fixed $n$, forms an $m$-group under
this m-adic operation. Hence also we have the group property criterion for a
finite m-adic group of m-adic matrices. As in ordinary theory, we therefore
reinterpret m-adic linear group as an m-adic group of m-adic matrices.
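
The component rule for the product lends itself to a quick numerical check. The following is a minimal sketch in Python and is not part of the original paper: the helper names (`matmul`, `madic_product`, `as_ordinary`) and the sample triadic matrices are illustrative assumptions. It forms the product of $m$ m-adic matrices component by component, the superscripts advancing cyclically modulo $m-1$, and compares the result with the ordinary product of the corresponding block matrices on $(m-1)n$ variables.

```python
from functools import reduce

def matmul(X, Y):
    """Ordinary matrix product of two list-of-rows matrices."""
    return [[sum(X[r][k] * Y[k][c] for k in range(len(Y)))
             for c in range(len(Y[0]))] for r in range(len(X))]

def madic_product(factors):
    """Product of m m-adic matrices, each given as a list of m-1 square matrices."""
    m, comps = len(factors), len(factors[0])
    assert comps == m - 1 and all(len(F) == comps for F in factors)
    # Component i is A_1^(i) A_2^(i+1) ... A_m^(i), superscripts taken cyclically.
    return [reduce(matmul, [factors[t][(i + t) % comps] for t in range(m)])
            for i in range(comps)]

def as_ordinary(A):
    """Block matrix on (m-1)n variables with A^(i) in block row i, block column i+1 (cyclic)."""
    comps, n = len(A), len(A[0])
    M = [[0] * (comps * n) for _ in range(comps * n)]
    for i, block in enumerate(A):
        j = (i + 1) % comps
        for r in range(n):
            for c in range(n):
                M[i * n + r][j * n + c] = block[r][c]
    return M

# Hypothetical triadic example (m = 3, n = 2); the entries are arbitrary.
A1 = [[[1, 2], [0, 1]], [[2, 0], [1, 1]]]
A2 = [[[0, 1], [1, 0]], [[1, 1], [0, 1]]]
A3 = [[[3, 0], [0, 1]], [[1, 0], [2, 1]]]
P = madic_product([A1, A2, A3])
Q = reduce(matmul, [as_ordinary(A) for A in (A1, A2, A3)])
agrees = as_ordinary(P) == Q
print(agrees)
```

Here `agrees` records whether the componentwise rule and the ordinary block-matrix product coincide for the sample data; exact integer arithmetic keeps the comparison free of rounding.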

We could correspondingly reinterpret the concrete containing group of
the complete m-adic linear group. It suffices for our purpose merely to do
so for the $(m-1)$-ads of the containing group. If $\alpha^{(i)}$ is the matrix of the
coefficients in the transformation thus expressing the $x_{ij}$'s in terms of the
$x'_{ik}$'s, we shall represent the $(m-1)$-ad $\alpha$ by the sequence of matrices
$\alpha=(\alpha', \alpha'', \ldots, \alpha^{(m-1)})$. The dyadic operation on matrix $(m-1)$-ads is
then seen from the corresponding transformations to be

\[ (\alpha', \alpha'', \ldots, \alpha^{(m-1)})(\beta', \beta'', \ldots, \beta^{(m-1)}) = (\alpha'\beta', \alpha''\beta'', \ldots, \alpha^{(m-1)}\beta^{(m-1)}), \]

while the product of an $(m-1)$-ad and a monad, i.e., a single m-adic linear
transformation, will be given by

\[ \alpha A = [\alpha'A', \alpha''A'', \ldots, \alpha^{(m-1)}A^{(m-1)}], \]

of a monad by an $(m-1)$-ad by

\[ A\alpha = [A'\alpha'', A''\alpha''', \ldots, A^{(m-1)}\alpha']. \]




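Clearly the identity among $(m-1)$-ads is $(E, E, \ldots, E)$, where $E$ is the
ordinary matrix identity, while the inverse of $(\alpha', \alpha'', \ldots, \alpha^{(m-1)})$ is
$([\alpha']^{-1}, [\alpha'']^{-1}, \ldots, [\alpha^{(m-1)}]^{-1})$.
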
We consider now the important question of change of variable. Let $S$
be an m-adic linear transformation carrying the $x_{ij}$'s into the $x'_{(i+1)j}$'s, $T$
an m-adic linear transformation expressing the $x_{ij}$'s in terms of $X_{(i+1)k}$'s, and
likewise the $x'_{ij}$'s in terms of $X'_{(i+1)k}$'s. As a result, the $X_{ij}$'s are carried
into the $X'_{(i+1)j}$'s according to an m-adic linear transformation $R$. We shall
say that $R$ is the result of m-adically changing variables in $S$ according
to $T$. Now with $R$, $S$, and $T$ considered to be ordinary linear transformations
on $(m-1)n$ variables, $R$ is the result of an ordinary change of variables
in $S$ according to $T$, and hence is the transform of $S$ with respect to $T$. If
then in the equation $R=T^{-1}ST$ we follow through the successive linear transformations,
we obtain the following results on the corresponding m-adic matrices. If

\[ S = [S', S'', \ldots, S^{(m-1)}], \qquad T = [T', T'', \ldots, T^{(m-1)}], \]

then the transform

\[ R = [R', R'', \ldots, R^{(m-1)}] \]

of $S$ with respect to $T$, which is the result of m-adically changing the variables
of $S$ according to $T$, is given by the equations

\[ R^{(i)} = [T^{(i-1)}]^{-1} S^{(i-1)} T^{(i)}, \qquad i=1, 2, \ldots, m-1 \ (*), \]

the superscript $i-1$ being replaced by $m-1$ when $i=1$.
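
The transform equations can be checked against the ordinary relation $R=T^{-1}ST$ on the $(m-1)n$ variables. The sketch below is an illustration only, not the paper's own computation: it assumes the componentwise rule $R^{(i)}=[T^{(i-1)}]^{-1}S^{(i-1)}T^{(i)}$ with superscripts read cyclically, uses hypothetical helper names, and works with exact fractions on arbitrary sample matrices.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[r][k] * Y[k][c] for k in range(len(Y)))
             for c in range(len(Y[0]))] for r in range(len(X))]

def inverse(M):
    """Gauss-Jordan inverse over exact fractions; raises StopIteration if M is singular."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(r == c)) for c in range(n)]
           for r, row in enumerate(M)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [x / p for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def as_ordinary(A):
    """Block matrix with component i in block row i, block column i+1 (cyclic)."""
    comps, n = len(A), len(A[0])
    M = [[0] * (comps * n) for _ in range(comps * n)]
    for i, block in enumerate(A):
        j = (i + 1) % comps
        for r in range(n):
            for c in range(n):
                M[i * n + r][j * n + c] = block[r][c]
    return M

def madic_transform(S, T):
    """Assumed rule: component i is inverse(T^(i-1)) S^(i-1) T^(i), indices cyclic."""
    # Python's negative index i - 1 = -1 picks the last component when i = 0.
    return [matmul(matmul(inverse(T[i - 1]), S[i - 1]), T[i]) for i in range(len(S))]

# Hypothetical triadic sample (m = 3, n = 2) with nonsingular components.
S = [[[1, 2], [3, 5]], [[2, 1], [1, 1]]]
T = [[[1, 1], [0, 1]], [[2, 0], [1, 1]]]
R = madic_transform(S, T)
ordinary = matmul(matmul(inverse(as_ordinary(T)), as_ordinary(S)), as_ordinary(T))
consistent = as_ordinary(R) == ordinary
print(consistent)
```

The comparison is made on the full block matrices, so it also confirms that the transform is again of m-adic form for the sample data.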


Closer to the ordinary concept of change of variable would be instituting
an ordinary change of variable in each space $\Sigma^{(i)}$. This would then correspond
to changing variables according to an $(m-1)$-ad. As before, if $S$ is an
m-adic linear transformation, $\tau$ equivalent to an $(m-1)$-ad of m-adic linear
transformations, the result of changing variables in $S$ according to $\tau$ will be
an m-adic linear transformation $R$ with $R=\tau^{-1}S\tau$. The corresponding formula
for transforming the m-adic matrix $S=[S', S'', \ldots, S^{(m-1)}]$, by the $(m-1)$-ad
$\tau=(\tau', \tau'', \ldots, \tau^{(m-1)})$, to yield the m-adic matrix $R=[R', R'', \ldots, R^{(m-1)}]$
may again be obtained by following through the transformations involved, or,
perhaps just as easily, by applying our formulas for operations on $(m-1)$-ads.
We thus obtain

\[ R^{(i)} = [\tau^{(i)}]^{-1} S^{(i)} \tau^{(i+1)}, \qquad i=1, 2, \ldots, m-1, \]

the superscript $i+1$ being replaced by 1 when $i=m-1$.

While our m-adic matrix notation is more convenient in most applications,
our later generalization of characteristic equation requires rather the matrix
of the corresponding ordinary linear transformation in the $(m-1)n$ variables.


(*) These equations can also be obtained from the equations defining the m-adic operation 
on m-adic matrices, and the original m-adic definition of transform. 




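With $A=[A', A'', \ldots, A^{(m-1)}]$, the corresponding ordinary matrix then has
the following form

\[ \begin{pmatrix}
0 & A' & 0 & \cdots & 0 \\
0 & 0 & A'' & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & A^{(m-2)} \\
A^{(m-1)} & 0 & 0 & \cdots & 0
\end{pmatrix}. \]

If then $D$ is the determinant of this matrix, $D', D'', \ldots, D^{(m-1)}$ of the components
$A', A'', \ldots, A^{(m-1)}$ of $A$, it follows that

\[ D = (-1)^{mn} D'D'' \cdots D^{(m-1)}. \]

By contrast, for the $(m-1)$-ad $\alpha=(\alpha', \alpha'', \ldots, \alpha^{(m-1)})$, the corresponding
ordinary matrix has the components of $\alpha$ along its principal diagonal, zeros
elsewhere, and the determinant of the matrix is always the product of the
determinants of the components.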
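
The sign in the determinant relation is easy to get wrong, so a small exact check may be useful. The following Python sketch is an illustration under stated assumptions, not the author's computation: it takes the sign factor to be $(-1)^{mn}$, builds the block matrix described above, and compares its determinant with the signed product of the component determinants. The helper names and sample matrices are hypothetical.

```python
from fractions import Fraction

def det(M):
    """Determinant by Gaussian elimination over exact fractions."""
    M = [[Fraction(x) for x in row] for row in M]
    n, sign, value = len(M), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            sign = -sign
        value *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return sign * value

def as_ordinary(A):
    """Block matrix with component i in block row i, block column i+1 (cyclic)."""
    comps, n = len(A), len(A[0])
    M = [[0] * (comps * n) for _ in range(comps * n)]
    for i, block in enumerate(A):
        j = (i + 1) % comps
        for r in range(n):
            for c in range(n):
                M[i * n + r][j * n + c] = block[r][c]
    return M

def determinant_relation_holds(A):
    """Compare D with (-1)^(mn) D' D'' ... D^(m-1) for one m-adic matrix A."""
    m, n = len(A) + 1, len(A[0])
    expected = Fraction((-1) ** (m * n))
    for component in A:
        expected *= det(component)
    return det(as_ordinary(A)) == expected

examples = [
    [[[2]], [[5]]],                                              # m = 3, n = 1
    [[[2]], [[3]], [[5]]],                                       # m = 4, n = 1
    [[[1, 2], [3, 4]], [[2, 1], [1, 1]]],                        # m = 3, n = 2
    [[[1, 2], [3, 4]], [[2, 1], [1, 1]], [[0, 1], [1, 3]]],      # m = 4, n = 2
    [[[1, 0, 2], [0, 1, 1], [1, 1, 0]],
     [[2, 1, 0], [0, 1, 0], [1, 0, 1]]],                         # m = 3, n = 3
]
results = [determinant_relation_holds(A) for A in examples]
print(results)
```

The examples mix odd and even values of $mn$, so both signs of the factor are exercised.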

36. m-adic collineations and collineation-groups. If the variables of each
space $\Sigma^{(i)}$ be considered homogeneous coordinates in a corresponding space
$S^{(i)}$ of dimension $n-1$, our m-adic linear transformation $A$ on $\Sigma', \Sigma'', \ldots, \Sigma^{(m-1)}$
may be said to define an m-adic collineation on $S', S'', \ldots, S^{(m-1)}$.
In fact, if we let the ratios $x_{i1}/x_{in}, \ldots, x_{i(n-1)}/x_{in}$ be denoted by $y_{i1}, \ldots,
y_{i(n-1)}$, we are thus led to the m-adic linear fractional transformation,
$i=1, 2, \ldots, m-1$:

\[ y_{is} = \frac{a^{(i)}_{s1} y'_{(i+1)1} + \cdots + a^{(i)}_{s(n-1)} y'_{(i+1)(n-1)} + a^{(i)}_{sn}}
{a^{(i)}_{n1} y'_{(i+1)1} + \cdots + a^{(i)}_{n(n-1)} y'_{(i+1)(n-1)} + a^{(i)}_{nn}}, \qquad s=1, 2, \ldots, n-1. \]

Unlike the case of an m-adic linear transformation, our m-adic linear fractional
transformation is in general not a special case of an ordinary linear fractional
transformation on all the variables, since the denominators in general
are not all the same. On the other hand it justifies our phrase m-adic collineation,
since the equality of the denominators for each $i$ insures our m-adic
linear fractional transformation on the nonhomogeneous $y$'s carrying the
straight lines of each $S^{(i)}$ into those of $S^{(i+1)}$. Moreover, the product of $m$
m-adic linear fractional transformations of $S', S'', \ldots, S^{(m-1)}$ will again be
of that form, so that we can expect to have m-adic linear fractional groups,
and hence m-adic collineation-groups.

Two m-adic linear transformations $A_1$ and $A_2$ on $\Sigma', \Sigma'', \ldots, \Sigma^{(m-1)}$ will
yield the same m-adic linear fractional transformation on $S', S'', \ldots, S^{(m-1)}$
when and only when their m-adic matrices $A_1=[A_1', A_1'', \ldots, A_1^{(m-1)}]$ and
$A_2=[A_2', A_2'', \ldots, A_2^{(m-1)}]$ are such that the elements of each component
$A_1^{(i)}$ are a constant $k_i$ times the elements of the corresponding component $A_2^{(i)}$.


0 A’ 0 ---0 
0 0 A”.--0 
0 0 «++ A(m-2) 




This then is the condition that $A_1$ and $A_2$ represent the same m-adic collineation.
Since the $k_i$'s need not be the same, $A_1$ and $A_2$ as ordinary linear transformations
need not then represent the same collineation in the ordinary
sense. If now we let $\tau$ be the $(m-1)$-ad

\[ ((k_1, k_1, \ldots, k_1), (k_2, k_2, \ldots, k_2), \ldots, (k_{m-1}, k_{m-1}, \ldots, k_{m-1})) \]

whose components are all ordinary similarity-matrices, we see from the preceding
section that $A_1=\tau A_2$. We shall call an $(m-1)$-ad each of whose components
is a similarity-matrix a similarity-$(m-1)$-ad. It follows that $A_1$ and
$A_2$ represent the same m-adic collineation when and only when $A_1A_2^{-1}$ is a
similarity-$(m-1)$-ad.

$A_2^{-1}A_1$ must then also be a similarity-$(m-1)$-ad; but it will equal $A_1A_2^{-1}$
when and only when the $k_i$'s are all equal. In fact, again by the preceding
section, writing the above $\tau=(\tau', \tau'', \ldots, \tau^{(m-1)})$, we find that
$\tau A_2=A_2\bar{\tau}$, where $\bar{\tau}=(\tau^{(m-1)}, \tau', \ldots, \tau^{(m-2)})$, and hence $A_2^{-1}A_1=\bar{\tau}$. Comparing these
two results, we see that $A_2^{-1}\tau A_2=\bar{\tau}$. Since $A_2$ is an arbitrary m-adic matrix,
it follows that every m-adic matrix transforms a similarity-$(m-1)$-ad
$(\tau', \tau'', \ldots, \tau^{(m-1)})$ into the similarity-$(m-1)$-ad

\[ (\tau^{(m-1)}, \tau', \ldots, \tau^{(m-2)}). \]

By contrast, every similarity-$(m-1)$-ad is transformed into itself by an
$(m-1)$-ad.

Consider now any m-adic linear group $G$. Since the product of two similarity-$(m-1)$-ads
is again a similarity-$(m-1)$-ad, the similarity-$(m-1)$-ads of
$G_0$, the associated ordinary group of $G$, will constitute a subgroup $H_0$ of $G_0$.
Since every m-adic matrix transforms a similarity-$(m-1)$-ad into a similarity-$(m-1)$-ad,
$H_0$ will be invariant under $G$. We may therefore form the
m-adic quotient group $K=G/H_0$. Each coset of $G$ as regards $H_0$ can be written
$H_0A$ with $A$ in $G$, and hence consists of elements of $G$ representing the same
m-adic collineation as $A$, and, in fact, of all such elements of $G$. The elements
of $K$ are thus in 1-1 correspondence with the distinct m-adic collineations
represented by the elements of $G$. $K$ may therefore be called the m-adic collineation-group
corresponding to $G$.

An arbitrary m-adic collineation-group $G$ may be given by arbitrarily representing
each collineation by an m-adic linear transformation(*). If $G$ is of
order $g$, and written thus "on $n$ variables," a modification of the ordinary
treatment will yield an m-adic linear group of order $n^{m-1}g$ which is $(n^{m-1}, 1)$
isomorphic with $G$, and whose transformations have components of determinant
unity. In fact let $S=[S', S'', \ldots, S^{(m-1)}]$ be in $G$ thus represented, with
the determinants of its components $D', D'', \ldots, D^{(m-1)}$ respectively. Let $\theta^{(i)}$
be any solution of the equation $[\theta^{(i)}]^n=[D^{(i)}]^{-1}$, and form the similarity-


(*) The product of m such representatives need not then be in the given set of representa- 
tives, but need merely represent the same m-adic collineation as some member of the set. 




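$(m-1)$-ad $\tau=((\theta', \theta', \ldots, \theta'), (\theta'', \theta'', \ldots, \theta''), \ldots, (\theta^{(m-1)}, \theta^{(m-1)}, \ldots, \theta^{(m-1)}))$.
Then $A=\tau S=[(\theta', \theta', \ldots, \theta')S', (\theta'', \theta'', \ldots, \theta'')S'', \ldots,
(\theta^{(m-1)}, \theta^{(m-1)}, \ldots, \theta^{(m-1)})S^{(m-1)}]$ represents the same m-adic collineation as $S$, and has
all of its components of determinant unity. For each $S$ there will thus be $n^{m-1}$
$A$'s, and these constitute all of the m-adic linear transformations with components
of determinant unity representing the same m-adic collineation as $S$.
It then readily follows that the set of $n^{m-1}g$ m-adic linear transformations
thus corresponding to the $g$ elements of $G$ constitute a linear $m$-group isomorphic
with $G$. For let $S_1, S_2, \ldots, S_m$ be any $m$ transformations in the original
representation of $G$, $A_1=\tau_1S_1$, $A_2=\tau_2S_2$, $\ldots$, $A_m=\tau_mS_m$ corresponding
transformations with components of determinant unity. Then

\[ A = A_1A_2 \cdots A_m \]

has for its $i$th component

\[ A^{(i)} = A_1^{(i)} A_2^{(i+1)} \cdots A_m^{(i)}
= \tau_1^{(i)} S_1^{(i)} \tau_2^{(i+1)} S_2^{(i+1)} \cdots \tau_m^{(i)} S_m^{(i)} \]
\[ = \tau_1^{(i)} \tau_2^{(i+1)} \cdots \tau_m^{(i)} S_1^{(i)} S_2^{(i+1)} \cdots S_m^{(i)} = \tau^{(i)} S^{(i)}, \]

where $S=S_1S_2 \cdots S_m$, and $\tau$ is a similarity-$(m-1)$-ad. $A$ therefore has components
of determinant unity, and represents the same m-adic collineation as
$S$. $A$ is therefore in our set of $n^{m-1}g$ transformations, whence finally our result.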
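
The count of $n^{m-1}$ normalized representatives can be made concrete numerically. The sketch below is a hedged illustration, not taken from the paper: it is restricted to $2 \times 2$ components (so $n=2$), uses floating-point complex roots, and the function names and the sample triadic matrix are assumptions. For each component it picks every solution $\theta$ of $\theta^n=D^{-1}$ and scales the component by it.

```python
import cmath
from itertools import product

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def nth_roots(z, n):
    """All n complex solutions theta of theta**n == z, for nonzero z."""
    r, phi = cmath.polar(z)
    return [cmath.rect(r ** (1.0 / n), (phi + 2 * cmath.pi * k) / n) for k in range(n)]

def unit_determinant_representatives(S):
    """All m-adic matrices tau*S whose components have determinant 1 (2 x 2 components only)."""
    n = 2  # this sketch only handles n = 2
    choices = [nth_roots(1 / det2(C), n) for C in S]
    return [[[[theta * x for x in row] for row in C] for theta, C in zip(thetas, S)]
            for thetas in product(*choices)]

# Hypothetical triadic sample (m = 3, n = 2): component determinants -2 and 6.
S = [[[1, 2], [3, 4]], [[2, 0], [1, 3]]]
reps = unit_determinant_representatives(S)
count = len(reps)                      # n**(m-1) representatives for this S
worst = max(abs(det2(C) - 1) for A in reps for C in A)
print(count, worst < 1e-9)
```

Each representative differs from `S` only by a scalar factor on each component, so all of them stand for the same m-adic collineation.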

To compare the ordinary treatment with this modification of it, we introduce
the following considerations. Given an m-adic linear group $G$, those
similarity-$(m-1)$-ads of $G_0$ which have equal components themselves constitute
a subgroup $H_0'$ of $G_0$ invariant under $G$. We may therefore form the
m-adic quotient group $K'=G/H_0'$. Each coset of the expansion of $G$ as regards
$H_0'$ consists of all transformations in $G$ which as ordinary transformations on
$(m-1)n$ variables correspond to the same ordinary collineation. We shall
therefore call $K'$ the collineation-m-adic group of $G$. If now an arbitrary
collineation-m-adic group $G$ be given by corresponding representative m-adic
linear transformations, the ordinary treatment applies without modification;
and if $G$ is of order $g$, and on $n$ variables, a linear m-adic group of order
$(m-1)ng$ is thus obtained which is $[(m-1)n, 1]$ isomorphic with $G$, and
whose members as ordinary transformations are of determinant unity. On the
other hand, if an arbitrary m-adic collineation-group $G$ be thus given, the
ordinary unmodified treatment will in general be inapplicable. In fact, otherwise,
the given representatives of the members of $G$ must also be representatives
of the members of a collineation-m-adic group. This will clearly not be
so for random representations of the members of $G$. And the following example
shows that the m-adic collineation-group $G$ may be such that no representation
thereof will represent a collineation-m-adic group. The triadic
collineations corresponding to 

-1 0 


A: [(1,1), (1, -1)], 3 


1 0/’ 




generate a triadic collineation-group $G$ of order 4. The most arbitrary representations
of $A$ and $B$ are


By direct computation we find that $AAA=[(a^2b, -a^2b), (ab^2, ab^2)]$,
$BBA=[(-acd, acd), (bcd, bcd)]$. As triadic collineations, $AAA$ and $BBA$
are identical, being the same as $[(1, -1), (1, 1)]$. As ordinary collineations,
they can but be identified with $[(a, -a), (b, b)]$, $[(-a, a), (b, b)]$ which are
never the same. Since any representation of $G$ can have but one triadic linear
transformation for each triadic collineation in $G$, no representation of this
triadic collineation-group can also represent a collineation-triadic group.
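
The distinction drawn in this example can be traced with concrete numbers. The following Python sketch is illustrative only. It assumes that the general representative of $A$ is $[(a, a), (b, -b)]$, which is what the stated value of $AAA$ suggests, and it takes the value of $BBA$ directly from the text rather than recomputing it, since the matrix of $B$ is not legible here. Function names and the sample values of $a, b, c, d$ are hypothetical.

```python
from fractions import Fraction
from math import prod

def diag_madic_product(factors):
    """m-adic product of diagonal m-adic matrices; a component is a tuple of diagonal entries."""
    m, comps, n = len(factors), len(factors[0]), len(factors[0][0])
    assert comps == m - 1
    return [tuple(prod(factors[t][(i + t) % comps][k] for t in range(m)) for k in range(n))
            for i in range(comps)]

def component_ratios(X, Y):
    """Scalars k_i with X^(i) = k_i * Y^(i), or None if some component is not proportional."""
    ratios = []
    for xc, yc in zip(X, Y):
        found = {Fraction(x, y) for x, y in zip(xc, yc)}
        if len(found) != 1:
            return None
        ratios.append(found.pop())
    return ratios

a, b, c, d = 2, 3, 5, 7          # arbitrary nonzero sample values
A = [(a, a), (b, -b)]            # assumed general representative of [(1, 1), (1, -1)]
AAA = diag_madic_product([A, A, A])
BBA = [(-a * c * d, a * c * d), (b * c * d, b * c * d)]   # value quoted from the text
ratios = component_ratios(BBA, AAA)
same_madic = ratios is not None                          # proportional component by component
same_ordinary = same_madic and len(set(ratios)) == 1     # a single common factor throughout
print(AAA, ratios, same_madic, same_ordinary)
```

With these sample values the two components of `BBA` are multiples of those of `AAA` by factors of opposite sign, which is the sense in which the two agree as triadic collineations but not as ordinary ones.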

If however $G$ itself is an m-adic linear group, both methods are applicable.
The unmodified treatment will then yield an m-adic linear group which is
$[(m-1)n, 1]$ isomorphic with the collineation-m-adic group of $G$, and whose
members as ordinary linear transformations have determinants unity. On
the other hand, our modified treatment yields an m-adic linear group which
is $(n^{m-1}, 1)$ isomorphic with the m-adic collineation-group of $G$, and whose
members have components of determinant unity.

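37. m-adic Hermitian invariants. A set of $m-1$ positive-definite Hermitian
forms $J=[J', J'', \ldots, J^{(m-1)}]$, one for each space $\Sigma^{(i)}$, will be said to
be an m-adic (positive-definite) Hermitian form. Now

\[ J^{(i)} = \sum_{k=1}^{n} \sum_{l=1}^{n} a^{(i)}_{kl} x_{ik} \bar{x}_{il}, \qquad \bar{a}^{(i)}_{kl} = a^{(i)}_{lk}, \]

can be transformed into

\[ I^{(i)} = y_{i1}\bar{y}_{i1} + y_{i2}\bar{y}_{i2} + \cdots + y_{in}\bar{y}_{in} \]

by a change of variables of the form

\[ y_{ik} = p^{(i)}_{k1} x_{i1} + p^{(i)}_{k2} x_{i2} + \cdots + p^{(i)}_{kk} x_{ik}, \qquad k=1, 2, \ldots, n. \]

Hence $J=[J', J'', \ldots, J^{(m-1)}]$ can be transformed into $I=[I', I'', \ldots, I^{(m-1)}]$
by changing variables in $\Sigma', \Sigma'', \ldots, \Sigma^{(m-1)}$ according to an $(m-1)$-ad whose
components, with $i=1, 2, \ldots, m-1$, are of the above form. The $(m-1)$-ad, of
course, is that obtained by solving for the $x$'s in terms of the $y$'s. It is further
understood that in operating on $J$ by this $(m-1)$-ad, if $x_{ij}$ is replaced by a
certain expression, $\bar{x}_{ij}$ is replaced by the conjugate of that expression.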

If, on the other hand, $J$ is transformed according to an m-adic change of
variables, $J^{(i)}$, written on the variables of $\Sigma^{(i)}$, becomes an expression in
the new variables not of $\Sigma^{(i)}$ but of $\Sigma^{(i+1)}$. We are thus led to define an m-adic
Hermitian invariant of an m-adic linear group as an m-adic Hermitian form



$J=[J', J'', \ldots, J^{(m-1)}]$ such that each transformation in the group carries
$J'$ into $J''$, $J''$ into $J'''$, $\ldots$, $J^{(m-1)}$ into $J'$. It then readily follows that every m-adic
linear group $G$ has an m-adic Hermitian invariant. For let $G_0'$ be the $\Sigma'$ constituent
group of $G_0$, the complete analogue of the $G_0'$ of an m-adic substitution
group. The linear group $G_0'$ then has an Hermitian invariant $J'$ on the
variables of $\Sigma'$. Let $S$ be in $G$, and let $J''$ be the result of transforming $J'$
according to $S$, $\ldots$, $J^{(m-1)}$ of transforming $J^{(m-2)}$ according to $S$. Then
$J=[J', J'', \ldots, J^{(m-1)}]$ will be an m-adic Hermitian form on $\Sigma', \Sigma'', \ldots,
\Sigma^{(m-1)}$, and, as in §39 to come, is seen to be an m-adic Hermitian invariant
of $G$.

By combining the above two results it follows that the variables of an
m-adic linear group $G$ may be so changed according to an $(m-1)$-ad that
$I=[I', I'', \ldots, I^{(m-1)}]$ is an m-adic Hermitian invariant of the resulting transform of $G$.

An m-adic linear group $G$ in $n$ variables will be said to be linearly reducible(*)
if by a suitable change of variables according to an $(m-1)$-ad there will
be in $\Sigma', \Sigma'', \ldots, \Sigma^{(m-1)}$ subspaces(*) $\Sigma_1', \Sigma_1'', \ldots, \Sigma_1^{(m-1)}$ respectively on
$\nu < n$ variables each such that $\Sigma_1' \to \Sigma_1'' \to \cdots \to \Sigma_1^{(m-1)} \to \Sigma_1'$ under every
transformation in the resulting transform of $G$. If for some such change of
variables the subspaces $\Sigma_2', \Sigma_2'', \ldots, \Sigma_2^{(m-1)}$ on the remaining variables
of $\Sigma', \Sigma'', \ldots, \Sigma^{(m-1)}$ are also each transformed into the next, then $G$ will
be said to be intransitive. In the first case $\Sigma_1', \Sigma_1'', \ldots, \Sigma_1^{(m-1)}$ will be said
to be a reduced set for $G$, in the second case a set of intransitivity of $G$. We then
prove the theorem a linearly reducible m-adic linear group $G$ is intransitive,
and a reduced set constitutes one of the sets of intransitivity of $G$, subject, of
course, to a change of variables in the reduced set according to an $(m-1)$-ad
thereon. We may assume the variables in the reduced set to be the first $\nu$
variables of each $\Sigma^{(i)}$. Then $G$ may be further transformed by an $(m-1)$-ad
so that it will have the m-adic Hermitian invariant $I$ above. And this further
change of variables, according to the form given above, merely transforms the
reduced set according to an $(m-1)$-ad on its variables. With $G$ in this last
form, consider its containing group $G^*$. Then $I'+I''+\cdots+I^{(m-1)}$ will
be an ordinary Hermitian invariant of the ordinary linear group $G^*$, while the
$(m-1)\nu$ variables constituting the reduced set for $G$ form a reduced set for $G^*$
without further transformation. But then $G^*$ is in intransitive form with those
$(m-1)\nu$ variables constituting a set of intransitivity of $G^*$. The same is then
true of $G$.

An m-adic matrix $A=[A', A'', \ldots, A^{(m-1)}]$ will be said to be in canonical
form if each component is in the canonical form $(a_1^{(i)}, a_2^{(i)}, \ldots, a_n^{(i)})$.
Then the corresponding ordinary theorem generalizes, i.e., if $A$ is of finite

(*) To distinguish between this extension of the ordinary concept and the totally unrelated
polyadic concept we have termed reducibility.
(*) Strictly, a misnomer, but a convenient one.




m-adic order, then it can be reduced to canonical form by transformation by an
$(m-1)$-ad(*). We shall prove this result in the next section more expeditiously.
However we here give the analogue of the ordinary proof for the sake
of the concepts thus introduced.

We prove then that we can always find $m-1$ linear functions

\[ y_{i1} = b_1^{(i)} x_{i1} + b_2^{(i)} x_{i2} + \cdots + b_n^{(i)} x_{in}, \qquad i=1, 2, \ldots, m-1, \]

such that each $y_{i1}$ is transformed into a constant $\theta_i$ times $y_{(i+1)1}$ by $A$. These
$m-1$ functions may then be said to constitute a relative m-adic invariant of $A$.
With $A$ the transformation

\[ x_{ik} = \sum_{l=1}^{n} a^{(i)}_{kl} x'_{(i+1)l}, \]

we find that $(y_{i1})A=\theta_i y_{(i+1)1}$ provided the following equations are true:

\[ \theta_i b_l^{(i+1)} = \sum_{s=1}^{n} a^{(i)}_{sl} b_s^{(i)}, \qquad l=1, 2, \ldots, n. \]

By successive substitution, with $i=1, 2, \ldots, m-1$, we obtain from these
equations

\[ \theta_1\theta_2 \cdots \theta_{m-1} b_l' = \sum_{s=1}^{n} a^{(0)}_{sl} b_s', \qquad l=1, 2, \ldots, n, \]

where the ordinary matrix $(a^{(0)}_{sl})=A_0=A'A'' \cdots A^{(m-1)}$. A set of solutions
$b_1', b_2', \ldots, b_n'$, not all zero, of this last set of equations can always be found
provided $\theta_1\theta_2 \cdots \theta_{m-1}$ is a root of the characteristic equation of $A_0$. The preceding
equations, with $i=1, 2, \ldots, m-2$, then determine the remaining $b$'s,
while the equations for $i=m-1$ are then automatically satisfied.

Having thus found a relative m-adic invariant of $A$, the remainder of the
proof follows the lines of the standard proof. That is, by a change of variables
according to an $(m-1)$-ad given in part by our relative m-adic invariant
of $A$, the new variables $y_{11}, y_{21}, \ldots, y_{(m-1)1}$ are transformed according to
the equations $y_{i1}=\theta_i y'_{(i+1)1}$, $i=1, 2, \ldots, m-1$, and hence constitute a reduced
set for the m-adic linear group generated by $A$. If then $A$ is of finite
m-adic order, further change of variables according to an $(m-1)$-ad will


(*) It might be thought that since $A$ as ordinary linear transformation is then of finite
ordinary order, the standard theorem would apply. But note that an m-adic matrix in canonical
form is not in canonical form as ordinary matrix. And from the contrary point of view, while $A$
as ordinary matrix could thus be reduced to ordinary canonical form, the resulting linear transformation
would no longer be an m-adic linear transformation; and the transformation used to
obtain it would be a linear transformation on all the $(m-1)n$ variables in a form constituting a
meaningless jumble from the point of view of m-adic linear transformations.

Or, more expeditiously, from = 0102 * 




change $y_{11}, y_{21}, \ldots, y_{(m-1)1}$ into a set of intransitivity of the group generated
by $A$. $A$ then determines an m-adic linear transformation on the remaining
$n-1$ variables, and the process may be repeated until $A$ appears in canonical
form, and, indeed, as the result of a single change of its original variables according
to an $(m-1)$-ad.

Our proof of the existence of relative m-adic invariants of $A$ might have
taken a different turn. Our original $(m-1)n$ homogeneous linear equations
in the $(m-1)n$ undetermined $b$'s will have a set of solutions not all zero, and
hence, as shown by the equations themselves, not all zero for any $i$, provided
the determinant of their coefficients is zero. We are thus led to one equation
in the $m-1$ unknowns $\theta_1, \theta_2, \ldots, \theta_{m-1}$ which may be called the m-adic characteristic
equation of $A$. Its right-hand member is zero; left, the determinant of
$A$ as ordinary linear transformation with the elements of the principal diagonal,
all zero in $A$, replaced by $-\theta_{m-1}, \ldots, -\theta_{m-1}, -\theta_1, \ldots, -\theta_1, \ldots,
-\theta_{m-2}, \ldots, -\theta_{m-2}$. With $\theta_1=\theta_2=\cdots=\theta_{m-1}$ the m-adic characteristic
equation of $A$ becomes the ordinary characteristic equation of $A$ as ordinary
linear transformation. We are thus, in fact, assured of relative m-adic invariants
of $A$ with $\theta$'s all equal. However, comparison with the earlier treatment
yields the following result. The solutions of the m-adic characteristic equation
of $A=[A', A'', \ldots, A^{(m-1)}]$ consist of all sets of values $\theta_1, \theta_2, \ldots, \theta_{m-1}$ for
which $\theta_1\theta_2 \cdots \theta_{m-1}$ is a root of the characteristic equation of $A_0=A'A'' \cdots A^{(m-1)}$.
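
The final statement about the m-adic characteristic equation can be tested on a small case. The Python sketch below is an assumption-laden illustration rather than the paper's own working: it builds the ordinary block matrix of a triadic $A$ (with $A^{(i)}$ in block row $i$, block column $i+1$), places $-\theta_{m-1}$ on the first diagonal block, $-\theta_1$ on the second, and so on, and evaluates the determinant exactly. The helper names and the sample matrices are hypothetical.

```python
from fractions import Fraction

def det(M):
    """Determinant by Gaussian elimination over exact fractions."""
    M = [[Fraction(x) for x in row] for row in M]
    n, sign, value = len(M), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            sign = -sign
        value *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return sign * value

def characteristic_matrix(A, thetas):
    """Block matrix of A with the diagonal of block i replaced by -theta_(i-1), cyclically."""
    comps, n = len(A), len(A[0])
    M = [[0] * (comps * n) for _ in range(comps * n)]
    for i, block in enumerate(A):
        j = (i + 1) % comps
        for r in range(n):
            for c in range(n):
                M[i * n + r][j * n + c] = block[r][c]
        theta = thetas[(i - 1) % comps]   # first block gets theta_(m-1), second theta_1, ...
        for r in range(n):
            M[i * n + r][i * n + r] = -theta
    return M

# Hypothetical triadic sample: A_0 = A'A'' = [[4, 2], [3, 3]] has characteristic roots 1 and 6.
A = [[[1, 2], [0, 3]], [[2, 0], [1, 1]]]
on_root_6 = det(characteristic_matrix(A, [2, 3]))               # theta_1 * theta_2 = 6
on_root_1 = det(characteristic_matrix(A, [Fraction(1, 2), 2]))  # theta_1 * theta_2 = 1
off_root = det(characteristic_matrix(A, [2, 1]))                # theta_1 * theta_2 = 2
print(on_root_6, on_root_1, off_root)
```

Only the product of the $\theta$'s matters in this sample: the determinant vanishes for both factorizations of a root and is nonzero otherwise.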

38. Reduction to canonical form. If for two m-adic linear transformations
$A$ and $B$ in $n$ variables there is a third $C$ such that $B=C^{-1}AC$, then $A$ and $B$
will be said to be conjugate. This is equivalent to there being an $(m-1)$-ad
$\gamma$ such that $B=\gamma^{-1}A\gamma$, since $C$ and $A^{m-2}C$ on the one hand, $\gamma$ and $A\gamma$ on the
other, yield the same transform of $A$. It follows that the relation "$A$ and $B$
are conjugate" is an equivalence relation. Likewise for m-adic linear groups.

The following easily proved theorem reduces the problem of conjugate
m-adic linear transformations in $n$ variables to that of conjugate ordinary
linear transformations in $n$ variables. The necessary and sufficient condition
that $A=[A', A'', \ldots, A^{(m-1)}]$ and $B=[B', B'', \ldots, B^{(m-1)}]$ are conjugate
is that $A_0=A'A'' \cdots A^{(m-1)}$ and $B_0=B'B'' \cdots B^{(m-1)}$ are conjugate. In fact,
if $B=\gamma^{-1}A\gamma$, $\gamma=(\gamma', \gamma'', \ldots, \gamma^{(m-1)})$, then by our formula for change of
variables according to an $(m-1)$-ad

\[ B' = [\gamma']^{-1}A'\gamma'', \quad B'' = [\gamma'']^{-1}A''\gamma''', \quad \ldots, \quad B^{(m-1)} = [\gamma^{(m-1)}]^{-1}A^{(m-1)}\gamma'. \]

Hence

\[ B'B'' \cdots B^{(m-1)} = [\gamma']^{-1}A'A'' \cdots A^{(m-1)}\gamma', \]

whence the necessity of our condition. Conversely, if $A_0$ and $B_0$ are conjugate,
$\gamma'$ may be chosen to satisfy the last of the above equations. If then
$\gamma'', \gamma''', \ldots, \gamma^{(m-1)}$ are determined in accordance with the first $m-2$ of the




change of variable equations, the last of those equations will be automatically
satisfied. An $(m-1)$-ad $\gamma=(\gamma', \gamma'', \ldots, \gamma^{(m-1)})$ is thus determined
which transforms $A$ into $B$.
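
The converse construction can be followed step by step in the special case $B=[A_0, E, \ldots, E]$. The sketch below is a minimal illustration under stated assumptions, not the paper's procedure verbatim: it assumes the componentwise rule $B^{(i)}=[\gamma^{(i)}]^{-1}A^{(i)}\gamma^{(i+1)}$ with cyclic superscripts, takes $\gamma'=E$, and determines the remaining components so that every component of the transform after the first is $E$. Helper names and the sample tetradic matrix are hypothetical.

```python
from fractions import Fraction
from functools import reduce

def matmul(X, Y):
    return [[sum(X[r][k] * Y[k][c] for k in range(len(Y)))
             for c in range(len(Y[0]))] for r in range(len(X))]

def identity(n):
    return [[int(r == c) for c in range(n)] for r in range(n)]

def inverse(M):
    """Gauss-Jordan inverse over exact fractions; raises StopIteration if M is singular."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(r == c)) for c in range(n)]
           for r, row in enumerate(M)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [x / p for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def transform_by_ad(A, gamma):
    """Assumed rule: component i of the transform is inverse(gamma_i) A_i gamma_(i+1), cyclic."""
    comps = len(A)
    return [matmul(matmul(inverse(gamma[i]), A[i]), gamma[(i + 1) % comps])
            for i in range(comps)]

def reducing_ad(A):
    """An (m-1)-ad with first component E carrying A into [A_0, E, ..., E]."""
    comps, n = len(A), len(A[0])
    gamma = [identity(n)] * comps
    tail = identity(n)
    for i in range(comps - 1, 0, -1):
        tail = matmul(A[i], tail)   # gamma_i = A_i A_(i+1) ... A_(m-1)
        gamma[i] = tail
    return gamma

# Hypothetical tetradic sample (m = 4, n = 2) with nonsingular components.
A = [[[1, 2], [0, 1]], [[2, 1], [1, 1]], [[1, 0], [3, 1]]]
A0 = reduce(matmul, A)
B = transform_by_ad(A, reducing_ad(A))
reduced = B == [A0, identity(2), identity(2)]
print(reduced)
```

No inverse is needed to build the transforming $(m-1)$-ad itself in this special case; inverses enter only when the transform is evaluated.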

This result contrasts strongly with the corresponding result for $(m-1)$-ads.
We may define two $(m-1)$-ads $\alpha$ and $\beta$ to be conjugate if there is an
$(m-1)$-ad $\gamma$ such that $\beta=\gamma^{-1}\alpha\gamma$. From our formula for the product of two
$(m-1)$-ads it follows that $\alpha=(\alpha', \alpha'', \ldots, \alpha^{(m-1)})$ and $\beta=(\beta', \beta'', \ldots, \beta^{(m-1)})$
are conjugate when and only when the corresponding components $\alpha^{(i)}$ and $\beta^{(i)}$
are conjugate for each $i$. Hence, while the question of conjugacy for an m-adic
matrix in $n$ variables depends on but one ordinary matrix in $n$ variables, the
same question for an $(m-1)$-ad depends on $m-1$ independent ordinary matrices
in $n$ variables each. Intrinsically, therefore, an m-adic matrix is far
simpler than an $(m-1)$-ad. This is rather surprising in that apart from change
of variables they are of equal generality; for if $A$ is a fixed m-adic matrix the
relation $S=\tau A$ gives a 1-1 correspondence between all m-adic matrices $S$ and
$(m-1)$-ads $\tau$.

A more symmetrical though less useful condition for the m-adic matrices
$A$ and $B$ being conjugate is that the $(m-1)$-ads $A^{m-1}$ and $B^{m-1}$ are conjugate.
In fact, if $A^{m-1}=\alpha$, the equation $A^m=\alpha A$ yields

\[ \alpha' = A'A'' \cdots A^{(m-1)}, \quad \alpha'' = A''A''' \cdots A^{(m-1)}A', \quad \ldots, \quad \alpha^{(m-1)} = A^{(m-1)}A' \cdots A^{(m-2)}. \]

The first component of $A^{m-1}$ is therefore the $A_0$ of our previous condition,
while all the components are conjugate. The present condition then follows.
We may note that all the components of an $(m-1)$-ad being conjugate is sufficient
as well as necessary for the $(m-1)$-ad being the $(m-1)$-st ordinary
power of some m-adic matrix. Intrinsically, then, an m-adic matrix is of the
same degree of generality as an $(m-1)$-ad with conjugate components. Too
much emphasis, however, must not be placed on the forms assumed by a
single element under transformation, our present concern.

Returning to our first condition for the conjugacy of m-adic matrices,
we have immediately that $A=[A', A'', \ldots, A^{(m-1)}]$ is conjugate to
$[A_0, E, \ldots, E]$, with $A_0=A'A'' \cdots A^{(m-1)}$. If now $A$ is of finite m-adic
order, then $A^{m-1}$, and hence its first component $A_0$, is of finite order. $A_0$ is
then conjugate to a matrix in the canonical form $(a_1, a_2, \ldots, a_n)$. Hence,
if $A$ is of finite m-adic order, it is conjugate to an m-adic matrix in the canonical
form $[(a_1, a_2, \ldots, a_n), E, \ldots, E]$.

More generally, if $A$ is of finite m-adic order, it is conjugate to those
m-adic matrices in the canonical form $[(a_1', a_2', \ldots, a_n'), (a_1'', a_2'', \ldots, a_n''), \ldots,
(a_1^{(m-1)}, a_2^{(m-1)}, \ldots, a_n^{(m-1)})]$ for which the products $a_i'a_i'' \cdots a_i^{(m-1)}$, $i=1, 2, \ldots, n$, are
a permutation of $a_1, a_2, \ldots, a_n$. Since $a_1, a_2, \ldots, a_n$ are the roots of the characteristic
equation of $A_0$, we may say, as a consequence of the last section,
that an m-adic matrix $A$ of finite order assumes those canonical forms for
which each selection of corresponding elements chosen from its components




constitutes a solution of the m-adic characteristic equation of $A$, while the
corresponding roots of the characteristic equation of $A_0$ are all of its roots
each with the correct multiplicity. In particular, we may make
$a_i'=a_i''=\cdots=a_i^{(m-1)}$ for each $i$. Hence the useful special result if $A$ is of finite
m-adic order, it is conjugate to an m-adic matrix in canonical form having equal
components.

The most satisfactory generalization of an ordinary similarity-matrix is
our similarity-(m−1)-ad. An m-adic matrix each of whose components is a
similarity-matrix will not in general remain of that form under transformation
by an m-adic matrix(99). We therefore define an m-adic similarity-matrix
as one which is conjugate to an m-adic matrix whose components are all
similarity-matrices. It readily follows from our criterion for the conjugacy of
m-adic matrices that $A=[A', A'', \dots, A^{(m-1)}]$ is an m-adic similarity-matrix
when and only when $A'A''\cdots A^{(m-1)}$ is a similarity-matrix. In particular,
every first order m-adic matrix is an m-adic similarity-matrix. In fact, A is
of m-adic order one when and only when $A'A''\cdots A^{(m-1)}=E$. Hence the
first order m-adic matrices are the conjugates of $[E, E, \dots, E]$.
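
The first-order criterion can be checked numerically in the triadic case m = 3. The following is only an illustrative sketch. It assumes that the triadic product of [A', A''], [B', B''], [C', C''] has components A'B''C' and A''B'C'', a convention chosen to agree with the first component of $A^{m-1}$ being $A'A''\cdots A^{(m-1)}$; the function names are hypothetical.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def triadic_product(A, B, C):
    """Assumed convention: [A',A''][B',B''][C',C''] = [A'B''C', A''B'C'']."""
    first = matmul(matmul(A[0], B[1]), C[0])
    second = matmul(matmul(A[1], B[0]), C[1])
    return (first, second)

def is_first_order(A):
    """A triadic matrix is of 3-adic order one when the product A.A.A equals A."""
    return triadic_product(A, A, A) == A

E = [[1, 0], [0, 1]]
P = [[1, 1], [0, 1]]
P_inv = [[1, -1], [0, 1]]

good = (P, P_inv)  # components with A'A'' = E
bad = (P, P)       # components with A'A'' different from E

print(matmul(good[0], good[1]) == E, is_first_order(good))
print(matmul(bad[0], bad[1]) == E, is_first_order(bad))
```

In both cases the two printed values agree, as the criterion requires: the triadic matrix is of order one exactly when the product of its components is E.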

Our chief reason for introducing the above concept is the following theorem.
If an m-adic linear group has an m-adic similarity-matrix as invariant
element, it is conjugate to a group in which each element is an m-adic matrix
with equal components. By an m-adic change of variable the invariant
similarity-matrix can be transformed into an m-adic matrix A in canonical form in
which the components are now equal similarity-matrices. If the given group
is correspondingly transformed, a conjugate group having A as invariant element
is obtained. For each element B of the transformed group we thus have
$A^{-1}BA=B$,

$$B^{(i)} = [A^{(i-1)}]^{-1}B^{(i-1)}A^{(i)}, \qquad i=1, 2, \dots, m-1.$$

Since $A^{(i-1)}$ and $A^{(i)}$ are the same similarity-matrices, we thus have
$B^{(i)}=B^{(i-1)}$ for each i, whence our theorem.

(99) Nevertheless, the set of such m-adic matrices of an m-adic linear group do constitute
a subgroup, if existent, though in general not an invariant subgroup, of the group; likewise,
those of these matrices having equal components. On the other hand, the subset of m-adic
similarity-matrices, in the sense about to be defined, while constituting an invariant subset
of the m-adic linear group by their very definition, do not in general constitute a subgroup
thereof. They do, however, when existent, separate into a number of semi-invariant subgroups
with the subgroup of similarity-(m−1)-ads as common associated group.

An m-adic linear group which is reducible to a 2-group automatically satisfies
the condition of this theorem via its invariant first order element. An
interesting property of any m-adic linear group thus conjugate to an “equi-component”
group is that its m-adic collineation-group is identical with its
collineation-m-adic group. In fact, in the case of an equi-component group
itself, the associated ordinary group consists of (m−1)-ads with equal
components, and hence has no other similarity-(m−1)-ads than those with equal
components; while under transformation by an (m—1)-ad the similarity- 
(m—1)-ads are unchanged. An equi-component group clearly has the follow- 
ing two properties: (a), it is simply isomorphic with a group of ordinary mat- 
rices in the specified number of variables, (b), no two distinct elements of 
the group have a pair of corresponding components the same. Now these 
properties are invariant for transformation by an (m—1)-ad; (a), by its very 
formulation, (b), by our formulas for transformation by an (m—1)-ad. Hence 
they are satisfied by all groups conjugate to equi-component groups. The 
class of groups satisfying condition (a), as well as the class of groups satisfy- 
ing condition (b), are therefore each at least as wide as the class of groups 
conjugate to equi-component groups. Actually each of the first two classes 
is wider than the third, for the following examples show that neither of the 
first two contains the other(100). Let $G_0$ be the axial group with elements
((1, 1), (−1, −1)), ((−1, −1), (1, 1)), ((−1, −1), (−1, −1)), ((1, 1), (1, 1));
$S_0=[(1, 1), (1, 1)]$. Then in terms of the present operations the conditions of
the construction theorem of §8 are satisfied, and $G=G_0S_0$ is a triadic linear
group in two variables. Now let $\bar G_0$ be the axial group with elements (1, −1),
(−1, 1), (−1, −1), (1, 1);


Then $\bar G=\bar G_0\bar S_0$ is a 3-group of ordinary matrices in two variables. With
elements of $G_0$ and $\bar G_0$ corresponding in order, $S_0$ corresponding to $\bar S_0$, the
conditions of the simple isomorphism theorem of §8 are satisfied, so that G is
simply isomorphic with $\bar G$. Hence G satisfies condition (a), but clearly fails to
satisfy condition (b), since condition (b) is equivalent to the same condition
stated for $G_0$. For our second example we consider the rather trivial case n=1.
With $G_0$ the cyclic group whose elements are ((i), (−i)), ((−1), (−1)),
((−i), (i)), ((1), (1)), and $S_0=[(1), (1)]$, $G=G_0S_0$ is a triadic linear group
in one variable satisfying condition (b). But it cannot satisfy condition (a);
for it is non-abelian, while any polyadic group of ordinary matrices in one
variable is readily seen to be abelian.
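
The second example is small enough to verify by enumeration. The sketch below is illustrative and rests on two assumptions not spelled out in this passage: that the triadic product of [a1, b1], [a2, b2], [a3, b3] is [a1·b2·a3, b1·a2·b3], and that "abelian" means the product is unchanged under every permutation of its three factors. The four elements are those of $G_0S_0$ with the components read as the numbers i, −1, −i, 1 paired as in the text.

```python
from itertools import permutations, product

# Elements [a, b] of G = G0 S0 in one variable, as pairs of complex numbers.
G = [(1j, -1j), (-1 + 0j, -1 + 0j), (-1j, 1j), (1 + 0j, 1 + 0j)]

def triadic_product(x, y, z):
    """Assumed convention: [a1,b1][a2,b2][a3,b3] = [a1*b2*a3, b1*a2*b3]."""
    return (x[0] * y[1] * z[0], x[1] * y[0] * z[1])

triples = list(product(G, repeat=3))

# Closure under the triadic operation.
closed = all(triadic_product(x, y, z) in G for x, y, z in triples)

# Abelian in the assumed sense: invariant under permuting the factors.
abelian = all(triadic_product(*p) == triadic_product(x, y, z)
              for x, y, z in triples
              for p in permutations((x, y, z)))

# Condition (b): distinct elements never share a corresponding component.
condition_b = all(x[k] != y[k] for x in G for y in G if x != y for k in (0, 1))

print(closed, abelian, condition_b)
```

Under these assumptions the set is closed, satisfies condition (b), and is not abelian; for instance [i, −i][1, 1][1, 1] and [1, 1][i, −i][1, 1] differ.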

We conclude this section with a proof of the following generalization of
the corresponding ordinary theorem. Any abelian m-adic linear group is conjugate
to a group each of whose elements is in canonical form with equal components.
We first prove this result for the case of an abelian group G having an
m-adic similarity-matrix A. By the proof of the theorem preceding the above
digression, G is conjugate to an equi-component group $\bar G$ in which $\bar A$, the
correspondent of A, has for its components equal similarity-matrices. Now
the constituent $\bar G_0'$ of the associated ordinary group $\bar G_0$ of $\bar G$ will be an
ordinary abelian linear group, and hence can be transformed by an ordinary
matrix $\alpha'$ so that each of its elements appears in canonical form. Since $\bar G_0$ will
consist of (m−1)-ads with equal components, the (m−1)-ad $\alpha=(\alpha', \alpha', \dots, \alpha')$
will transform $\bar G_0$ into a group in which each element appears with equal
components in canonical form. As $\alpha$ transforms $\bar A$ into itself, it will therefore
transform $\bar G=\bar G_0\bar A$ into the conjugate of G of our theorem.

(100) Clearly these distinctions constitute but a first glance at a probably wide theory.

Now let G be an arbitrary abelian m-adic linear group, A some fixed element
thereof. By a previous result, we may assume the group to have been
so transformed by an (m−1)-ad that A appears in canonical form with equal
components $A'$. The (m−1)-ad $A^{m-1}$ then has the equal components $A'^{\,m-1}$,
also in canonical form. It follows from the invariance of any element
$B=[B', B'', \dots, B^{(m-1)}]$ of G under $A^{m-1}$ that

$$A'^{\,m-1}B^{(i)} = B^{(i)}A'^{\,m-1}$$

for each i. If then we separate the variables of each space $\Sigma^{(i)}$ into sets
$\Sigma_1^{(i)}, \Sigma_2^{(i)}, \dots, \Sigma_l^{(i)}$, according to their distinct multipliers in $A'^{\,m-1}$, the proof
of the corresponding ordinary theorem shows that $B^{(i)}$ transforms the variables
of each $\Sigma_j^{(i)}$ into those of $\Sigma_j^{(i+1)}$. Each element B of G therefore transforms
$\Sigma_j' \to \Sigma_j'' \to \cdots \to \Sigma_j^{(m-1)} \to \Sigma_j'$. That is, G appears in intransitive form with
the $l$ sets of intransitivity corresponding to $j=1, 2, \dots, l$. Now for each set
of intransitivity the corresponding partial transformations constitute an
abelian m-adic linear group. Moreover, the corresponding partial transformation
of A is an m-adic similarity-matrix, since the corresponding partial
transformation of $A'^{\,m-1}$ has but one distinct multiplier. Hence, by our special
result, each of these constituent groups can be thrown into the desired
form by transformation by an (m−1)-ad on the corresponding set of
intransitivity. Together, these $l$ partial (m−1)-ads constitute an (m−1)-ad
on $\Sigma', \Sigma'', \dots, \Sigma^{(m-1)}$ which transforms G into the conjugate group of our
theorem.

Clearly, every m-adic linear group, each of whose elements is in canonical 
form with equal components, is abelian. On the other hand, unlike the ordi- 
nary case, an m-adic linear group each of whose elements is in canonical form 
need not be abelian. It is readily proved that the necessary and sufficient con- 
dition that such a group be abelian is that its associated ordinary group con- 
sist of elements with equal components. 

39. m-adic invariants. In the theory of ordinary linear groups in n variables
the concept of a function of those variables precedes that of an invariant.
In our theory of m-adic linear groups G in n variables it is therefore natural to
replace the concept of a function by a set of m−1 functions, one for each of
the spaces $\Sigma', \Sigma'', \dots, \Sigma^{(m-1)}$. If we transform such a set of functions
$[f'(x_1, \dots, x_n), \dots, f^{(m-1)}(x_{(m-2)n+1}, \dots, x_{(m-1)n})]$ by an m-adic linear
transformation T of $\Sigma', \Sigma'', \dots, \Sigma^{(m-1)}$, each function of the variables of one
space is transformed into a function of the variables of the next.

We therefore define $f=[f', f'', \dots, f^{(m-1)}]$ to be an (absolute) m-adic
invariant of T if T transforms $f' \to f'', f'' \to f''', \dots, f^{(m-1)} \to f'$; of G, if f is an
m-adic invariant of each element of G. Actually, the following analysis shows
this definition to be too narrow for a real generalization of the ordinary
concept. But how to widen it without destroying our basic concept of m−1
spaces $\Sigma', \Sigma'', \dots, \Sigma^{(m-1)}$ we do not at present know.

Our chief result involves the associated constituent groups $G_0', G_0'', \dots,
G_0^{(m-1)}$ of G already introduced in §37 as the complete analogues of the
corresponding concepts for m-adic substitution groups. More specifically, we saw
that if G is an m-adic linear group of m-adic matrices $T=[T', T'', \dots, T^{(m-1)}]$,
$G_0$, the associated ordinary group of G, may be concretely given by a group
of (m−1)-ads $\tau=(\tau', \tau'', \dots, \tau^{(m-1)})$. For each i, $\tau^{(i)}$ represents a
transformation of the space $\Sigma^{(i)}$ into itself; and the set of $\tau^{(i)}$'s constitute an ordinary
group, the associated constituent group $G_0^{(i)}$ above. It is then fundamental
that, as in the case of m-adic substitution groups, the associated constituent
groups of G are conjugate, each element T of G in fact transforming
$G_0' \to G_0'', G_0'' \to G_0''', \dots, G_0^{(m-1)} \to G_0'$. To verify this fact we need only
observe that T transforms $G_0$ into itself; while if we follow through the
operations involved in $T^{-1}\tau T$, we see that the ith component of the resulting
(m−1)-ad is the transform of the (i−1)-st component of $\tau$ by $T^{(i-1)}$.

Now let $f=[f', f'', \dots, f^{(m-1)}]$ be an m-adic invariant of G; that is, each
element of G transforms $f' \to f'', f'' \to f''', \dots, f^{(m-1)} \to f'$. Each element $\tau$ of
$G_0$ may be written as the product $T_1T_2\cdots T_{m-1}$ of m−1 elements of G. By
following through these m−1 transformations we see that $\tau$ transforms $f'$
into itself. But $\tau$ can operate on $f'$ only through its first constituent $\tau'$. Hence
each $\tau'$ transforms $f'$ into itself, and $f'$ is an ordinary invariant of the
associated constituent group $G_0'$.

Conversely, let $f'$ be any invariant of $G_0'$, $T_0$ some element of G. $T_0$ will
transform $f'$, a function of the variables of $\Sigma'$, into a function of the
variables of $\Sigma''$. Call this function $f''$, i.e., $f''=(f')T_0$. Likewise write
$f'''=(f'')T_0, \dots, f^{(m-1)}=(f^{(m-2)})T_0$. Since $f'$ is
an invariant of $G_0'$, it will actually be transformed into itself by each element
of $G_0$, and hence by the (m−1)-ad $T_0^{m-1}$. That is, $(f^{(m-1)})T_0=f'$, and
$f=[f', f'', \dots, f^{(m-1)}]$ is an m-adic invariant of $T_0$. We now show that it is
also an m-adic invariant of every element T of G, that is, of G. Since $G_0''$ is the
transform of $G_0'$ under $T_0$, it follows that if $\tau''$ is any element of $G_0''$, then for
some element $\tau'$ of $G_0'$, $(f'')\tau''=(f')T_0\cdot T_0^{-1}\tau'T_0=(f')\tau'T_0=(f')T_0=f''$. Hence
$f''$ is an invariant of $G_0''$, and likewise $f'''$ of $G_0''', \dots, f^{(m-1)}$ of $G_0^{(m-1)}$.
Each element $\tau=(\tau', \tau'', \dots, \tau^{(m-1)})$ of $G_0$ will therefore transform each
function $f', f'', \dots, f^{(m-1)}$ into itself. Hence, by writing an arbitrary element
T of G in the form $\tau T_0$, with $\tau$ in $G_0$, we see that T, along with $T_0$, will
transform $f' \to f'', f'' \to f''', \dots, f^{(m-1)} \to f'$.

We have thus proved the following theorem. Given an m-adic linear group
G with first associated constituent group $G_0'$, then every m-adic invariant
$f=[f', f'', \dots, f^{(m-1)}]$ of G is such that $f'$ is an ordinary invariant of $G_0'$;
and, conversely, every ordinary invariant $f'$ of $G_0'$ yields an m-adic invariant
$f=[f', f'', \dots, f^{(m-1)}]$ of G. Clearly, this correspondence between m-adic
invariants of G and ordinary invariants of $G_0'$ is 1-1. A like correspondence of
course exists between the m-adic invariants of G and the ordinary invariants
of $G_0^{(i)}$ for any i.

The weakness of our concept of m-adic invariants, already apparent
from this reduction to ordinary invariants, is conclusively demonstrated
by a consideration of invariants as group determiners. While the groups
in question will in general be infinite, no part of the above discussion
involves the hypothesis of finiteness in a linear group. Suppose then that
$f=[f', f'', \dots, f^{(m-1)}]$ is an m-adic invariant of at least one m-adic linear
transformation $T_0$, and let G be the set of all m-adic linear transformations
with f as m-adic invariant. It is then readily verified that G is an m-adic linear
group. By the proof of the above theorem, $f'$ is an invariant of $G_0'$, and,
likewise, $f''$ of $G_0'', \dots, f^{(m-1)}$ of $G_0^{(m-1)}$. If then $\tau', \tau'', \dots, \tau^{(m-1)}$ is any
selection from $G_0', G_0'', \dots, G_0^{(m-1)}$, and $\tau=(\tau', \tau'', \dots, \tau^{(m-1)})$, then $T=\tau T_0$
has f for m-adic invariant. T is therefore in G, and hence $\tau$ in $G_0$. That is,
the m-adic linear group defined by a given m-adic invariant is of that special
kind in which the associated ordinary group consists of all selections, written
as (m−1)-ads, that can be made from the associated constituent groups.

When the above definition is extended to relative m-adic invariant,
entirely corresponding results obtain. However, by a device similar to that
which gave us our m-adic alternating groups, we can enlarge somewhat the role
of relative m-adic invariant as group determiner. $f=[f', f'', \dots, f^{(m-1)}]$ will
be a relative m-adic invariant of an m-adic linear transformation T if T
transforms f so that $f' \to \chi'f'', f'' \to \chi''f''', \dots, f^{(m-1)} \to \chi^{(m-1)}f'$, the $\chi$'s being constants
depending on T. Each T having f as relative m-adic invariant thus determines
a $\chi$-sequence. Furthermore, if $T_1, T_2, \dots, T_m$ have f as relative m-adic
invariant, so also will $T=T_1T_2\cdots T_m$; and the $\chi$-sequence of T is determined
by the $\chi$-sequences of $T_1, T_2, \dots, T_m$ by the same equations that connected
the $\delta$-sequences of our alternating group theory. We are thus led to a complete
m-adic $\chi$-group; and corresponding to any subgroup thereof, the set of
all T's with $\chi$-sequences in that subgroup will be an m-adic linear group.
Furthermore, whenever the associated ordinary group of the $\chi$-subgroup does
not consist of all selections from its constituent associated subgroups, the
corresponding m-adic linear group will also not be of this special type.
However, with the $f^{(i)}$'s homogeneous polynomials in the corresponding variables,
any T having f for relative m-adic invariant can be changed to a T
having f for absolute m-adic invariant by multiplying it into a suitable
similarity-(m−1)-ad; and conversely, without qualification. Hence the T's
corresponding to any one $\chi$-sequence represent the same m-adic collineations as the
T's having f for absolute invariant. All the m-adic linear groups corresponding
to the various $\chi$-subgroups therefore have the same corresponding m-adic
collineation-group as the G defined by f as absolute invariant, and our seemingly
greater freedom is largely illusory.

An obvious, but probably superficial, remedy for the relative triviality
of our concept of m-adic invariant would be to allow each of the functions
$f', f'', \dots, f^{(m-1)}$ to be functions not of the variables of the corresponding $\Sigma$
alone, but of all of the $\Sigma$'s. It may be mere prejudice that makes us object
to thus uniting the m−1 spaces of n dimensions each into one space of
(m−1)n dimensions; for, certainly, arbitrarily to give m−1 points, one for
each space, is equivalent to giving one point in the combined space. One
qualification does suggest itself. Corresponding to the condition of homogeneity
for the polynomial invariants of ordinary theory, §36 suggests that
the $f^{(i)}$'s be polynomials homogeneous in the variables of each $\Sigma$ separately.
However, a finally acceptable form for a general concept of m-adic invariant
will probably involve changes in our original idea both more specific and
more drastic than here suggested.

40. Generalization of m-adic substitution and transformation groups. The
concept of m-adic linear group is readily extended to that of an (m, μ) linear
group, analogous to our earlier (m, μ) substitution group. However, both concepts
admit of a far wider extension. We shall give this extension only for
m-adic substitution groups, the generalization of m-adic linear group being
entirely similar(101). It is of interest to note that this generalization continues
to be a generalization even when m=2. But the resulting ordinary groups are
then essentially realizations of Specht groups, referred to in the introduction,
or subgroups thereof(102).

(101) A corresponding generalization of our narrow concept of m-adic invariant immediately
suggests itself.

(102) On the other hand, groups of the permutations of sets of variables considered by L.
Weisner (Generalization of Lagrange's theorem, Bulletin of the American Mathematical Society,
vol. 32 (1926), pp. 629-630) are but a very special case of the present generalization with
m=2. We may note that the associated and containing ordinary groups of m-adic substitution
groups, and, indeed, of the present generalization thereof, also come under this generalization
with m=2, and thus tie up with Specht groups, or subgroups thereof.

The concept of an m-adic substitution on the letters of classes
$\Gamma_1, \Gamma_2, \dots, \Gamma_{m-1}$ is associated with the cyclic substitution $(\Gamma_1\Gamma_2\cdots\Gamma_{m-1})$
on the classes themselves; for, under the m-adic substitution, $\Gamma_1 \to \Gamma_2$, $\Gamma_2 \to \Gamma_3$,
$\dots$, $\Gamma_{m-1} \to \Gamma_1$. More generally then let $\Gamma_1, \Gamma_2, \dots, \Gamma_\nu$ be any finite set of
classes, $\sigma$ any substitution on those classes themselves as elements. s will then
be said to be a polyadic substitution corresponding to $\sigma$ if, whenever $\sigma$ replaces
class $\Gamma_i$ by class $\Gamma_j$, s carries the members of $\Gamma_i$ in 1-1 fashion into the members
of $\Gamma_j$. Clearly, if polyadic substitutions $s_1, s_2, \dots, s_m$ on the members of
$\Gamma_1, \Gamma_2, \dots, \Gamma_\nu$ correspond to $\sigma_1, \sigma_2, \dots, \sigma_m$ respectively, $s_1s_2\cdots s_m$, the
result of performing these m polyadic substitutions in succession, is itself a
polyadic substitution corresponding to $\sigma_1\sigma_2\cdots\sigma_m$, the product of the m
corresponding ordinary substitutions. It follows from our last result on
homomorphisms given in §4 that if G is an m-group of polyadic substitutions s on
the members of $\Gamma_1, \Gamma_2, \dots, \Gamma_\nu$ under the above m-adic operation, the
corresponding ordinary substitutions $\sigma$ form an m-group B of ordinary substitutions.
Moreover, G is homomorphic to B. We shall call B the basic m-group
corresponding to the polyadic substitution group G. In the case of our m-adic
substitution groups, and more generally our (m, μ) groups, the basic m-group
is of first order, its sole substitution consisting of a single cycle the number of
whose letters is m−1 in the first case, a divisor μ−1 of m−1 in the second.
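
The correspondence between a product of polyadic substitutions and the product of the class substitutions can be illustrated on a toy case. This is a minimal sketch with m = 3, three classes of two letters each, and three substitutions chosen arbitrarily for illustration; the letter names and helper functions are hypothetical.

```python
def compose(*subs):
    """Perform the given substitutions in succession, left to right."""
    def follow(x):
        for s in subs:
            x = s[x]
        return x
    return {x: follow(x) for x in subs[0]}

def class_substitution(s, classes):
    """Return sigma as a tuple: sigma[i] = j when s carries class i onto class j."""
    sigma = []
    for cls in classes:
        image = {s[x] for x in cls}
        matches = [j for j, other in enumerate(classes) if set(other) == image]
        if len(matches) != 1:
            raise ValueError("not a polyadic substitution on these classes")
        sigma.append(matches[0])
    return tuple(sigma)

classes = [("a1", "a2"), ("b1", "b2"), ("c1", "c2")]

# Three illustrative polyadic substitutions on the six letters.
s1 = {"a1": "b2", "a2": "b1", "b1": "c1", "b2": "c2", "c1": "a1", "c2": "a2"}
s2 = {"a1": "a2", "a2": "a1", "b1": "c2", "b2": "c1", "c1": "b1", "c2": "b2"}
s3 = {"a1": "c1", "a2": "c2", "b1": "b2", "b2": "b1", "c1": "a2", "c2": "a1"}

sig1, sig2, sig3 = (class_substitution(s, classes) for s in (s1, s2, s3))

# Product of the class substitutions, applied in the same left-to-right order.
product_sigma = tuple(sig3[sig2[sig1[i]]] for i in range(len(classes)))

print(class_substitution(compose(s1, s2, s3), classes) == product_sigma)
```

The class substitution of the product agrees with the product of the class substitutions, which is the homomorphism of G onto its basic m-group B in miniature.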

As a consequence of the homomorphism between an arbitrary polyadic 
substitution group G and its basic m-group B, we see that there are the same 
number of polyadic substitutions in G for each substitution in B. Hence, also, 
the order of G is always a multiple of the order of B. Again, the ordinary 
substitutions corresponding to the polyadic substitutions forming any sub- 
group of G will form a subgroup of B, if not B itself; while to each subgroup 
of B there is at least one corresponding subgroup of G, i.e., the one consisting 
of all the elements of G corresponding to the elements of the subgroup of B, 
and hence containing all such subgroups. 

For simplicity, we now restrict ourselves to mutually exclusive classes
$\Gamma_1, \Gamma_2, \dots, \Gamma_\nu$ of the same finite number of letters n each(103). Given any
substitution $\sigma$ on those classes as elements, there will then be a total of $(n!)^\nu$
polyadic substitutions corresponding to $\sigma$. If then B is a given m-group of
substitutions on those classes as elements, and b is the order of B, the $(n!)^\nu b$
polyadic substitutions corresponding to the elements of B are readily seen
to constitute a polyadic substitution group with B as basic group. It may be
called the m-adic symmetric group of degree n with basic m-group B. We can
now state that any polyadic group with basic m-group B is a subgroup of the
corresponding m-adic symmetric group. On the other hand, a subgroup of
that m-adic symmetric group may have but a subgroup of B for basic group.
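
The count $(n!)^\nu b$ and the closure of the resulting set can be confirmed by brute force on a very small case. The sketch below is illustrative only: it takes m = 3, two classes of n = 2 letters, and as a hypothetical basic group B the two substitutions on two classes (an ordinary group, and so closed under products of three factors); the encoding of letters as pairs is an assumption of the example.

```python
from itertools import permutations, product
from math import factorial

def polyadic_substitutions(sigma, n):
    """All polyadic substitutions for sigma on len(sigma) classes of n letters.

    Letter k of class i is encoded as the pair (i, k).
    """
    nu = len(sigma)
    result = []
    for choice in product(permutations(range(n)), repeat=nu):
        result.append({(i, k): (sigma[i], choice[i][k])
                       for i in range(nu) for k in range(n)})
    return result

def compose(*subs):
    """Perform the given substitutions in succession, left to right."""
    def follow(x):
        for s in subs:
            x = s[x]
        return x
    return {x: follow(x) for x in subs[0]}

n = 2
B = [(0, 1), (1, 0)]  # illustrative basic group on nu = 2 classes, order b = 2
nu = len(B[0])

group = [s for sigma in B for s in polyadic_substitutions(sigma, n)]

# Closure under the triadic operation (three substitutions in succession).
closed = all(compose(r, s, t) in group
             for r in group for s in group for t in group)

print(len(group), factorial(n) ** nu * len(B), closed)
```

Here the enumeration produces as many substitutions as the formula predicts, and every product of three of them lies again in the set.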

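(103) When B is transitive, the number of letters in the several $\Gamma$'s must of necessity be the
same.

Of the theory of m-adic substitution groups we shall redevelop here only
the general aspects of the theory leading to m-adic alternating groups.
Again form the Vandermonde determinants $\Delta_1, \Delta_2, \dots, \Delta_\nu$ for the letters of
$\Gamma_1, \Gamma_2, \dots, \Gamma_\nu$ respectively. If now a substitution $\sigma$ on the $\Gamma$'s as elements be
written in the primitive form

$$\sigma = \begin{pmatrix}\Gamma_1 & \Gamma_2 & \cdots & \Gamma_\nu\\ \Gamma_{i_1} & \Gamma_{i_2} & \cdots & \Gamma_{i_\nu}\end{pmatrix},$$

a polyadic substitution corresponding to $\sigma$ will transform the $\Delta$'s as follows:

$$\Delta_1 \to \delta'\Delta_{i_1},\quad \Delta_2 \to \delta''\Delta_{i_2},\quad \dots,\quad \Delta_\nu \to \delta^{(\nu)}\Delta_{i_\nu}.$$

To describe this transformation completely, we must therefore not only specify
the $\delta$-sequence $\delta=[\delta', \delta'', \dots, \delta^{(\nu)}]$, but the substitution $\sigma$. We therefore
form the couple $\{\sigma, \delta\}$. Given then a polyadic substitution group G, each element
thereof uniquely determines a $\{\sigma, \delta\}$ couple. Moreover, if $s_1, s_2, \dots, s_m$
are any m elements of G, $\{\sigma_1, \delta_1\}, \{\sigma_2, \delta_2\}, \dots, \{\sigma_m, \delta_m\}$ the corresponding
couples, then $s=s_1s_2\cdots s_m$ has a couple $\{\sigma, \delta\}$ completely determined by
the couples of $s_1, s_2, \dots, s_m$. For clearly $\sigma=\sigma_1\sigma_2\cdots\sigma_m$. On the other hand,
let $\delta=[\delta', \delta'', \dots, \delta^{(\nu)}]$, $\delta_i=[\delta_i', \delta_i'', \dots, \delta_i^{(\nu)}]$. For any substitution $\sigma$ on
the $\Gamma$'s as elements, if $\sigma$ carries $\Gamma_i$ into $\Gamma_j$, write $j=i\sigma$. Then we will have

$$\delta^{(i)} = \delta_1^{(i)}\,\delta_2^{(i\sigma_1)}\,\delta_3^{(i\sigma_1\sigma_2)}\cdots\delta_m^{(i\sigma_1\sigma_2\cdots\sigma_{m-1})}.$$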

It again follows from our last result on homomorphisms that the class of
$\{\sigma, \delta\}$ couples corresponding to the elements of G constitutes an m-group under
the resulting m-adic operation on $\{\sigma, \delta\}$ couples, and hence that G is
homomorphic to this m-group. We shall call the latter the $\{\sigma, \delta\}$ subgroup
corresponding to G. The homomorphism in question then again assures us
that there are exactly the same number of elements of G for each $\{\sigma, \delta\}$ couple
in its $\{\sigma, \delta\}$ subgroup, and again yields the many-one relation between the
subgroups of G and those of its $\{\sigma, \delta\}$ subgroup.

Clearly the relationship between G and its $\{\sigma, \delta\}$ subgroup is intimately
bound up with the relationship between G and its basic m-group B. In fact,
the very form of a $\{\sigma, \delta\}$ couple yields a many-one correspondence between
the elements of the $\{\sigma, \delta\}$ subgroup corresponding to G, and of B; while our
formulation of the m-adic operation on $\{\sigma, \delta\}$ couples shows this correspondence
to be a homomorphism; hence again the sameness of the number of
$\{\sigma, \delta\}$ couples corresponding to different $\sigma$'s, and the many-one correspondence
between the subgroups of the $\{\sigma, \delta\}$ subgroup, and of the basic m-group
B, corresponding to G. Much can now be said of the interrelations between G,
its $\{\sigma, \delta\}$ subgroup, and its basic m-group B. But they are all implicit in the
fact that the above homomorphism between G and B is the one determined
by the homomorphism between G and its $\{\sigma, \delta\}$ subgroup, and the
homomorphism between that $\{\sigma, \delta\}$ subgroup and B.

When G is the polyadic symmetric group of degree n corresponding to a
given basic m-group B, then, as in the case of m-adic substitutions, G will
have at least one polyadic substitution for each of the $2^\nu$ possible $\delta$-sequences,
and each substitution $\sigma$ in B, provided n>1. The “$\{\sigma, \delta\}$ subgroup” may now
be called the complete $\{\sigma, \delta\}$ group corresponding to B. With B of order b,
the corresponding complete $\{\sigma, \delta\}$ group is then of order $2^\nu b$. We thus have
a division of the corresponding $(n!)^\nu b$ polyadic substitutions into $2^\nu b$ mutually
exclusive classes of consequently $(n!/2)^\nu$ members each.
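
The division into $2^\nu$ classes of $(n!/2)^\nu$ members for a fixed $\sigma$ can be checked by enumeration. The sketch below assumes that each $\delta^{(i)}$ is the sign of the bijection carrying the letters of $\Gamma_i$ onto those of its image class, taken relative to a fixed ordering of the letters in every class (this is how a Vandermonde determinant changes sign); the values n = 3, $\nu$ = 2 are chosen only for illustration.

```python
from collections import Counter
from itertools import permutations, product
from math import factorial

def sign(p):
    """Sign of a permutation of range(len(p)), found by counting inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

n, nu = 3, 2

# For a fixed sigma, a polyadic substitution is one bijection per class,
# recorded here as a permutation of letter positions; its delta-sequence
# is the tuple of signs of those permutations.
counts = Counter(tuple(sign(p) for p in choice)
                 for choice in product(permutations(range(n)), repeat=nu))

print(len(counts), set(counts.values()), sum(counts.values()))
```

Every one of the $2^\nu$ sign patterns occurs, each the same number of times, and the class sizes add up to $(n!)^\nu$.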

Now in the many-one relations between the subgroups of the polyadic
symmetric group of degree n, the complete $\{\sigma, \delta\}$ group, and the basic
m-group B consider only those (proper) subgroups of the complete $\{\sigma, \delta\}$
group which correspond to B itself. For each of these $\{\sigma, \delta\}$ subgroups there
is a unique largest subgroup of the polyadic symmetric group. These may
then be called the polyadic alternating groups of degree n with basic m-group
B. The corresponding $\{\sigma, \delta\}$ subgroups are of orders $2^\lambda b$, $0 \le \lambda < \nu$, and the
polyadic alternating groups correspondingly of orders $(n!/2)^\nu 2^\lambda b$, each
consisting of all the elements in each of $2^\lambda b$ of the above mutually exclusive
classes. Note that if B is considered as a substitution group on the symbols
$\Gamma_1, \Gamma_2, \dots, \Gamma_\nu$ rather than on the classes they symbolize, then one and the
same B will serve for arbitrary n. Hence also the complete $\{\sigma, \delta\}$ group will
be independent of n; and for each n>1 there will be as many polyadic
alternating groups of degree n and basic m-group B as the complete $\{\sigma, \delta\}$ group
has subgroups also corresponding to B.

By considering an arbitrary polyadic group G of degree n, and with basic
m-group B, a subgroup of the corresponding polyadic symmetric group, we
see that the $\{\sigma, \delta\}$ subgroup for G is actually a subgroup, proper or improper,
of the complete $\{\sigma, \delta\}$ group corresponding to B. But that subgroup also
must correspond to B. That is, we have a many-one relation between all polyadic
groups of degree n with basic m-group B, and those subgroups of the
complete $\{\sigma, \delta\}$ group which themselves correspond to B.


COLLEGE OF THE CITY OF NEW YORK,
New York, N.Y.


ON A MINIMUM PROBLEM IN THE THEORY OF 
ANALYTIC FUNCTIONS OF SEVERAL 
VARIABLES 


BY 
W. T. MARTIN 


1. Introduction. In 1932 Wirtinger(1) posed and solved the following problem.
Given a region G in the complex z-plane and a (complex-valued) function
$\phi(z, \bar z)=\phi(x+iy, x-iy)$ continuous and with continuous first partial derivatives
with respect to x and y in G, to find an analytic function f(z) which
gives the best approximation to $\phi$ in the mean-square sense, that is, such that


(1.1) $\int_G |\phi - f|^2\,d\omega_z = \min,$


where $d\omega_z$ is the element of area dx dy. (In the case in which G extends to
infinity he also assumed that $\phi \in L^2$ over G.) By use of the Green's function
$G(z, \bar z; \zeta, \bar\zeta)$ for the region G, he proved the existence (and uniqueness) of
such an f and gave an explicit formula for f, namely,

‘ ’ dws, 
IG 
where $\partial/\partial z=\tfrac12(\partial/\partial x-i\,\partial/\partial y)$, etc. In the case of the
unit circle C the formula yields the result


1 

a result which he also obtained directly by use of the Fourier series for ¢ and f. 

Recently Wirtinger(2) posed the analogous question in the theory of functions
of several complex variables. In the case in which the region under
consideration is a hypersphere $H=E[\,|z_1|^2+\cdots+|z_n|^2<1\,]$ and
$\phi(z_1, \dots, z_n; \bar z_1, \dots, \bar z_n)$ is merely integrable over H, he obtained a (unique)
solution by use of multiple Fourier series, namely


! 


where $d\omega_\zeta$ is the 2n-dimensional volume element. He conjectured that the
question probably has a solution for general regions and for very general
functions $\phi$ but that the solution appeared to involve difficult investigations
on the extensions of Green's functions. Now in various questions in the theory
of functions of several complex variables Bergmann has been able to replace
the theory of the Green's functions by the theory of complex orthogonal
functions and the kernel of a region(3). In this note we show that by the use
of this theory of the kernel of a region we can solve the problem posed by
Wirtinger for a very general class of regions (which includes all bounded
regions) and for $\phi$ belonging to $L^2$; indeed, we give the solution explicitly in
terms of an integral involving $\phi$ and the kernel of the region (see equation
(3.10)).

Presented to the Society, October 28, 1939; received by the editors January 22, 1940.
This paper was received by the editors of the Bulletin of the American Mathematical Society
September 26, 1939, accepted by them, and later transferred to these Transactions.

(1) W. Wirtinger, Monatshefte für Mathematik und Physik, vol. 39 (1932), pp. 377-384.

(2) W. Wirtinger, Monatshefte für Mathematik und Physik, vol. 47 (1939), pp. 426-431.

It is known that results of this general nature have important applications;
for example in connection with the theory of entire functions of two
variables Bergmann has solved the same problem with the function f
biharmonic (the real part of an analytic function of two variables) rather than
analytic(4).

For the sake of completeness we shall give in §2 a brief résumé of the
results from the theory of orthogonal functions and the kernel of a region. Also
in the concluding section we consider certain extensions of the problem. We
shall speak only of two variables; the case of n variables involves no essential
changes.

2. The kernel of a region. To every region of a wide class of four-dimensional
regions there corresponds a kernel function which is defined as follows(5).
Let B be a region of this class and let $\{\Omega^{(\nu)}(z_1, z_2)\}$ be a complete
orthonormal system of analytic functions belonging to $L^2$ over B, so that


(2.1) $\int_B \Omega^{(\mu)}(z_1, z_2)\,\overline{\Omega^{(\nu)}(z_1, z_2)}\,d\omega_z = \delta_{\mu\nu}, \qquad \mu, \nu = 1, 2, \dots,$


where $\int_B=\lim_{n\to\infty}\int_{B_n}$, and $\{B_n\}$ is a system of regions in B converging to B.
The series


(2.2) $\sum_{\nu=1}^{\infty} \Omega^{(\nu)}(z_1, z_2)\,\overline{\Omega^{(\nu)}(\zeta_1, \zeta_2)}$


converges absolutely and uniformly for (z) and ($\zeta$) in any regions interior
to B and accordingly defines a function of $z_1, z_2, \bar\zeta_1, \bar\zeta_2$ analytic for (z) and ($\bar\zeta$)
in B (see $B_2$). The sum function is called the kernel of the region B and is
denoted by $K_B(z_1, z_2, \bar\zeta_1, \bar\zeta_2)$. It is known that the function depends only upon
the region B and not upon the particular set of orthonormal functions used
in defining it (see $B_4$). Concerning series in terms of the $\Omega^{(\nu)}$ it has been shown
that the series


(2.3) $\sum_{\nu=1}^{\infty} a_\nu\Omega^{(\nu)}$


can be integrated term-by-term over B whenever $\sum_\nu |a_\nu|^2 < \infty$ (see $B_3$, p. 331).

(3) For the development of the theory of the kernel of a region see S. Bergmann,
Mathematische Zeitschrift, vol. 29 (1929), pp. 640-677, Journal für die reine und angewandte
Mathematik, vol. 169 (1933), pp. 1-42, especially pp. 1-5; vol. 172 (1934), pp. 89-128. We shall refer to
these papers as $B_1$ and $B_2$, respectively.

(4) S. Bergmann, Mathematische Annalen, vol. 109 (1934), pp. 324-348, especially p. 333;
Compositio Mathematica, vol. 3 (1934), pp. 137-173. We shall refer to these papers as $B_3$
and $B_4$, respectively.

(5) The results which we state in this section are all given by Bergmann in the papers listed
in footnotes 3 and 4. We shall restrict ourselves to simply-connected bounded regions but the
results are true for any region for which there exists a set of linearly independent functions
belonging to $L^2$.

3. Solution of the problem. Let $\phi(z_1, z_2; \bar z_1, \bar z_2)$ be a complex-valued
function of the four real variables $x_1, x_2, y_1, y_2$, defined and of integrable square
over a bounded region B:


(3.1) $\int_B |\phi|^2\,d\omega_z < \infty.$


We seek a function $f(z_1, z_2)$ analytic and of integrable square over B and such
that


(3.2) $\int_B |\phi - f|^2\,d\omega_z = \min.$


For the solution let $\{\Omega^{(\nu)}(z_1, z_2)\}$ be a complete orthonormal set of analytic
functions belonging to $L^2$ over B and let us seek to determine coefficients
$\{a_\nu\}$ subject to the condition


(3.3) $\sum_{\nu=1}^{\infty} |a_\nu|^2 < \infty$

in such a manner that the function

(3.4) $f(z_1, z_2) = \sum_{\nu=1}^{\infty} a_\nu\Omega^{(\nu)}(z_1, z_2)$


furnishes a minimum to (3.2). If we substitute (3.4) into (3.2) we find


(3.5) $\int_B \Big|\phi - \sum_{\nu=1}^{\infty} a_\nu\Omega^{(\nu)}\Big|^2 d\omega_z = \min.$


Using (3.3) and the results stated in §2 we see that we may integrate
termwise, thus


(3.6) $\int_B |\phi - f|^2\,d\omega_z = \int_B |\phi|^2\,d\omega_z - \sum_{\nu=1}^{\infty} (a_\nu\bar b_\nu + \bar a_\nu b_\nu - a_\nu\bar a_\nu),$


where we have written


(3.7) $b_\nu = \int_B \phi\,\overline{\Omega^{(\nu)}}\,d\omega_z.$


By Bessel's inequality

(3.8) $\sum_{\nu=1}^{\infty} |b_\nu|^2 \le \int_B |\phi|^2\,d\omega_z < \infty.$


Treating $a_\nu, \bar a_\nu$ as independent complex variables and differentiating with
respect to $\bar a_\nu$ (or $a_\nu$) we see that Euler's conditions for (3.6) to be a minimum are


(3.9) $a_\nu = b_\nu, \qquad \nu = 1, 2, \dots.$


Clearly this choice of the a’s furnishes an actual minimum (we shall also give 
a direct proof of this fact in equation (3.12) below). Thus the minimizing 
function f has the form 


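$$f(z_1, z_2) = \sum_{\nu=1}^{\infty} b_\nu\Omega^{(\nu)}(z_1, z_2) = \int_B \phi(\zeta_1, \zeta_2; \bar\zeta_1, \bar\zeta_2)\sum_{\nu=1}^{\infty} \Omega^{(\nu)}(z_1, z_2)\,\overline{\Omega^{(\nu)}(\zeta_1, \zeta_2)}\,d\omega_\zeta,$$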


where we have again used the fact that we may interchange the order of in- 
tegration and summation. 
Thus we have answered Wirtinger’s question. 


THEOREM. Let B be any (four-dimensional) region for which there exists an
infinite system of linearly independent analytic functions of $L^2$. (In particular,
let B be any simply connected bounded region.) Let $\phi(z_1, z_2; \bar z_1, \bar z_2)$ be of integrable
square over B. Then the function f defined by

(3.10) $f(z_1, z_2) = \int_B \phi(\zeta_1, \zeta_2; \bar\zeta_1, \bar\zeta_2)\,K_B(z_1, z_2; \bar\zeta_1, \bar\zeta_2)\,d\omega_\zeta,$

where $K_B$ is the kernel of the region B defined as in (2.2), is analytic and of
integrable square over B and furnishes the unique minimum to the integral (3.2)
over the class of analytic functions of integrable square.
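The mechanism behind the theorem can be tried out in a finite-dimensional setting. The following is only a sketch under stated assumptions: vectors in four complex dimensions stand in for functions of $L^2$, the Hermitian dot product stands in for the integral over B, and the two basis vectors and the vector `phi` are hypothetical choices made purely for illustration. It checks that the choice $a_\nu = b_\nu$ leaves a residual orthogonal to the span, so that any competitor f + g is farther from `phi` than f.

```python
# Finite-dimensional stand-in for the minimum problem (3.2).
# Assumption: vectors in C^4 replace functions, and the Hermitian dot product
# replaces the integral over B. All numerical data below are hypothetical.

def inner(u, v):
    """Hermitian inner product <u, v> = sum of u_k * conj(v_k)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm_sq(u):
    """Squared length of u."""
    return inner(u, u).real

def combine(coeffs, basis):
    """Return the linear combination sum_nu coeffs[nu] * basis[nu]."""
    n = len(basis[0])
    return [sum(c * e[k] for c, e in zip(coeffs, basis)) for k in range(n)]

def subtract(u, v):
    return [a - b for a, b in zip(u, v)]

# Orthonormal pair playing the role of the orthonormal analytic functions.
s = 2 ** -0.5
basis = [[s, s * 1j, 0, 0], [0, 0, 1, 0]]

# The "given function" phi, deliberately not in the span of the basis.
phi = [1 + 2j, 3 - 1j, -2j, 0.5]

# Coefficients as in (3.7) and (3.9): a_nu = b_nu = <phi, basis_nu>.
b = [inner(phi, e) for e in basis]
f = combine(b, basis)

# Any other competitor has the form f + g with g in the span.
g = combine([0.3 - 0.7j, -1.1j], basis)
residual = subtract(phi, f)
lhs = norm_sq(subtract(residual, g))      # |phi - f - g|^2
rhs = norm_sq(residual) + norm_sq(g)      # |phi - f|^2 + |g|^2
```

Here `lhs` and `rhs` agree up to rounding, which is the finite-dimensional counterpart of the direct proof that the minimum is attained by f alone.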


Very many different properties of the kernel function are known which 




1940] MINIMUM PROBLEM FOR ANALYTIC FUNCTIONS 355 


yield various properties of the minimizing function f; for example if $g(z_1, z_2)$
is any analytic function of $L^2$ over B then(6)

(3.11) $\int_B (\phi - f)\,\bar g\,d\omega = 0.$


This result is the analogue for the region B of a result obtained by Wirtinger
for the hypersphere (loc. cit., footnote 2, equation (8)). It also obviously
furnishes a direct proof of the fact that the function f defined in (3.10) yields
a minimum for (3.2), since in view of (3.11), if $g \neq 0$,

(3.12) $\int_B |\phi - f - g|^2\,d\omega = \int_B |\phi - f|^2\,d\omega + \int_B |g|^2\,d\omega > \int_B |\phi - f|^2\,d\omega.$


4. Special regions. For many special regions the kernel function has been 
given explicitly, for instance in the case of a Reinhardt region in four-dimen- 
sional space 


(4.1) R = E[| 22|? < G(| 0 2:| < 1], 


where G is once differentiable in (0, 1), the kernel has the form (see B;) 


(4.2)


If we have a region R* which can be mapped into R by means of a transforma-
tion $z_\kappa = z_\kappa(w_1, w_2)$, $\kappa = 1, 2$, where the $z_\kappa$ are analytic in R*, then the kernel
function for R* is equal to the kernel for R multiplied by the two jacobians of
the transformation (see Bi, p. 5):

(4.3) $K_{R^*}(w_1, w_2; \bar\xi_1, \bar\xi_2) = K_R\big(z_1(w_1, w_2), z_2(w_1, w_2); \overline{z_1(\xi_1, \xi_2)}, \overline{z_2(\xi_1, \xi_2)}\big)\,\frac{D(z_1, z_2)}{D(w_1, w_2)}\,\overline{\frac{D(z_1, z_2)}{D(\xi_1, \xi_2)}}.$

In different cases the series in (4.2) can be summed, for example in the


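(6) We may see this fact directly in view of the form of f and the orthogonality of the
$\Omega$'s, or we may note that the corresponding result for the case of biharmonic functions has been
proved by Bergmann (see Bs, p. 333). In order to see it directly let us write $c_\nu = \int_B g\,\overline{\Omega^{(\nu)}}\,d\omega$. Then
$g(z_1, z_2) = \sum c_\nu \Omega^{(\nu)}(z_1, z_2)$. Also by (3.4), (3.7) and (3.9) $f(z_1, z_2) = \sum b_\nu \Omega^{(\nu)}(z_1, z_2)$ where
$b_\nu = \int_B \phi\,\overline{\Omega^{(\nu)}}\,d\omega$. Thus
$\int_B (\phi - f)\,\bar g\,d\omega = \sum b_\nu \bar c_\nu - \sum b_\nu \bar c_\nu = 0.$
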


case of a region of the form 
(4.4) a| 2 +| <1, p integral, p > 0,0 <a 31, 


the kernel is (see B;) 


(p + 1)(1 — + (p — 
— 2ef2)? — 
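which yields for the hypersphere $H = E[\,|z_1|^2 + |z_2|^2 < 1\,]$ the result

$K_H(z_1, z_2; \bar\zeta_1, \bar\zeta_2) = \frac{2}{\pi^2 [1 - z_1\bar\zeta_1 - z_2\bar\zeta_2]^3}.$
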
If we put this into (3.10), then we see that for the hypersphere H our result 
is identical with the formula (1.3) obtained by Wirtinger (for n =2). 


It is perhaps worth while merely to mention that in the case of a bicylinder
$|z_\kappa| < r_\kappa$, $\kappa = 1, 2$, the kernel has the form (see B;)


(4.5) K(21, 223 £1, $2) = a?(1 — 


(4.6) Ku (21, 22; £1, £2) = 


22 


It is also interesting that in the case of simply connected regions in the
complex z-plane the kernel is simply the expression $\partial^2 G(z, \zeta)/\partial z\,\partial\bar\zeta$ where G
is the Green's function for the region(7). This of course is in agreement with
the result (1.2) of Wirtinger's mentioned in the introduction.

5. Extensions. A very important variation of the problem in the theory 
of functions of one complex variable is the case in which the integration is 
over the boundary curve. In the case of two complex variables, when the region
under consideration has a distinguished boundary surface, the analogous
problem may be solved and since there is a general theory of orthogonal
functions and kernel functions related to the distinguished boundary
surface(8), the same formula for f as in (3.10) is obtained, with of course the
kernel K defined analogously.

Moreover we may ask not only that f be analytic and of L? over B and 
minimize the integral (3.2) but also that f be subjected to certain additional 
conditions, for example that 


(7) The kernel for doubly connected regions in the complex z-plane has been calculated by
K. Zarankiewicz, Zeitschrift für angewandte Mathematik und Mechanik, vol. 14 (1934), pp.
97-104 and by P. Kufareff, Bulletin de l'Institut Mathématique et Méchanique, Tomsk, vol.
1 (1937), pp. 228-235.

(8) See Bergmann, Bulletin de l'Institut Mathématique et Méchanique, Tomsk, vol. 3
(1935-1937), pp. 242-257.



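where $\{t_1^{(\kappa)}, t_2^{(\kappa)}\} \in B$. We shall merely indicate the proof in the case p = 1. Our
problem is then to find an f analytic and of $L^2$ over B, which minimizes the
integral (3.2) and which takes on a given value X at a fixed point $(t_1, t_2)$ in B,

(5.2) $f(t_1, t_2) = X.$

The analogue of (3.6) is

$\int_B |\phi - f|^2\,d\omega - \lambda\big[\overline{f(t_1, t_2)} - \bar X\big] - \mu\big[f(t_1, t_2) - X\big]$
$= \int_B |\phi|^2\,d\omega - \sum (a_\nu \bar b_\nu + \bar a_\nu b_\nu - a_\nu \bar a_\nu) - \lambda\Big[\sum \bar a_\nu \overline{\Omega^{(\nu)}(t_1, t_2)} - \bar X\Big] - \mu\Big[\sum a_\nu \Omega^{(\nu)}(t_1, t_2) - X\Big],$

where $\lambda$, $\mu$ are the Lagrangian multipliers. Euler's conditions are

(5.3) $a_\nu = b_\nu + \lambda\,\overline{\Omega^{(\nu)}(t_1, t_2)}, \quad \nu = 1, 2, \cdots.$

Thus $\mu = \bar\lambda$ and the condition (5.2) yields

(5.4) $\sum b_\nu \Omega^{(\nu)}(t_1, t_2) + \lambda K_B(t_1, t_2; \bar t_1, \bar t_2) = X.$

Thus the minimizing function f has the form

(5.5) $f(z_1, z_2) = \int_B \phi(\zeta_1, \zeta_2; \bar\zeta_1, \bar\zeta_2)\,K_B(z_1, z_2; \bar\zeta_1, \bar\zeta_2)\,d\omega_\zeta + \frac{X - \int_B \phi(\zeta_1, \zeta_2; \bar\zeta_1, \bar\zeta_2)\,K_B(t_1, t_2; \bar\zeta_1, \bar\zeta_2)\,d\omega_\zeta}{K_B(t_1, t_2; \bar t_1, \bar t_2)}\,K_B(z_1, z_2; \bar t_1, \bar t_2).$

MASSACHUSETTS INSTITUTE OF TECHNOLOGY,
CAMBRIDGE, MASS.
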


ANALYTIC SYSTEMS OF CENTRAL CONICS IN SPACE 


BY 
J. L. COOLIDGE 


The amount of literature dealing with conic sections, individual curves 
and systems of curves, in one plane, is vast. When however we are dealing 
with a number of conics, not in the same plane, the situation is quite different. 
Certain figures, as the focal conics of a set of confocal quadrics, are familiar 
enough, but very little has been done in the way of a systematic study of more 
general systems. There are some studies carried out with the aid of purely 
synthetic methods; the algebraic or analytic treatment lags behind. 

The first writer to suggest a reasonable set of coordinates for a conic in
space was Spottiswoode(1). The totality of straight lines that intersect a conic
in three-space generates a very special sort of quadratic complex. The coeffi-
cients determining the equation of this complex, when a straight line has the
usual Plücker line coordinates, may be taken as the coordinates of the conic,
a clumsy enough system. A much better technique, perhaps the best for alge-
braic purposes, was developed by Johnson(2). Here a conic is looked upon
not as a locus, but as the envelop of its tangent planes. Thus its tangential
equation $a^{ij}u_iu_j = 0$ gives ten homogeneous coordinates, connected by a quar-
tic identity

$a^{ij} = a^{ji}, \quad |a^{ij}| = 0, \quad i, j = 1, 2, 3, 4.$


I think that this gives the best approach to the study of algebraic systems of 
conics, and I regret that more attention has not been given to the subject. 
For instance a complete study of linear and quadratic systems would be inter-
esting. When it comes to attacking differential properties of conics this tech-
nique is disappointing, even as the Plücker line coordinates are of compara-
tively little use in studying the differential properties of systems of lines. In
fact as far as I can make out very little has been written about the differential
geometry of systems of conics. The most important article I have been able
to find was by Blutel(3), and his problem is very special. In what follows I
am going to outline what seems to me the most promising way to approach 
the subject, and give a certain number of theorems. I hope that others may 
feel inclined to carry the study further, even though present mathematical 
fashion is concerned with very different questions. 

Presented to the Society, April 27, 1940; received by the editors December 1, 1939. 

(1) On the twenty-one coordinates of a conic in space, Transactions of the London Mathe-
matical Society, vol. 10 (1879).

(2) The conic as a space element, these Transactions, vol. 15 (1914).

(3) Recherches sur les surfaces qui sont en même temps lieux de coniques et enveloppes de cônes,
Annales de l'École Normale Supérieure, (3), vol. 7 (1890).


360 J. L. COOLIDGE  . [November 


1. Series of conics. The most obvious way to approach the study of the
conic in space is to treat it as a rational curve. Let us use nonhomogeneous
Cartesian coordinates, and assume that our conic is not a parabola. We may
then write its parametric equations in the form

$x^i = a^i t + b^i + c^i\,\frac{1}{t}, \quad i = 1, 2, 3.$

Here, (a) and (c) give the directions of the asymptotes, (b) are the coordinates
of the centre and, if we assume that t = 1 gives a vertex, $t^2$ is the ratio of the
distances from the asymptotes.

I give these equations because they seem to offer a favorable opening for 
the study of systems of conics, and in fact I personally first tried the problem 
in this way. I hasten to add that I was not at all able to attain the results 
which I believe to be easily attainable. 

I turn to a different method which seems to fit the case even better. This 
is the method of moving axes first developed by Darboux in the opening 
chapters of his Théorie Générale des Surfaces, and extended in recent years 
by Cartan. Let a point have the rectangular Cartesian coordinates $(X^i)$ with
regard to a set of fixed axes. Its coordinates with respect to a moving set of
such axes shall be $(x^i)$. The coordinates with regard to the fixed axes of the
moving origin shall be $(X_0^i)$. We then have the fundamental relations

(1) $X^i = X_0^i + \sum_k a_{ik}x^k, \quad |a_{ik}| = 1.$

Let the position of the point and also the situation of the moving axes be 


functions of a parameter v, which for simplicity of language I shall call “time.” 
We then have 


(2) 


I now seek the components with regard to the moving axes of the total veloc-
ity of the point. We write these $\partial x^i/\partial v$, while we mean by the notation $\delta x^i/\delta v$
the velocity with regard to the moving axes of the point's motion with regard
to those same axes:

6 


xi 
(3) 


(4) 


It is to be remembered that $\|a_{ik}\|$ is the matrix of an orthogonal substitution
of determinant 1.


ax aX. 4 
dv dv av $0 
k 

= — Vit = = —— - 

Ov Ov 


1940] CENTRAL CONICS IN SPACE 361 


These are the general formulae for moving rectangular axes in any num-
ber of dimensions. For our particular problem, let us assume that our conic
lies in the plane $x^3 = 0$ and that it is expressed parametrically

(5) $x^1 = a\cos u, \quad x^2 = b\sin u, \quad x^3 = 0.$

Let us further simplify the notation by taking i, j, k as a cyclic permutation
of 1, 2, 3 and putting


(6) = — 
We have then our fundamental formulae 
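(7)
$\frac{\partial x^1}{\partial u} = -a\sin u, \quad \frac{\partial x^1}{\partial v} = \xi^1 + \frac{\partial a}{\partial v}\cos u - p_3 b\sin u,$
$\frac{\partial x^2}{\partial u} = b\cos u, \quad \frac{\partial x^2}{\partial v} = \xi^2 + \frac{\partial b}{\partial v}\sin u + p_3 a\cos u,$
$\frac{\partial x^3}{\partial u} = 0, \quad \frac{\partial x^3}{\partial v} = \xi^3 + p_1 b\sin u - p_2 a\cos u.$

We have further in the classical notation

(8)
$E = a^2\sin^2 u + b^2\cos^2 u,$
$F = b\xi^2\cos u - a\xi^1\sin u + \frac{1}{2}\frac{\partial(b^2 - a^2)}{\partial v}\cos u\sin u + p_3 ab,$
$G = (\xi^1)^2 + (\xi^2)^2 + (\xi^3)^2 + 2\Big[\xi^1\frac{\partial a}{\partial v} + (\xi^2 p_3 - \xi^3 p_2)a\Big]\cos u + 2\Big[\xi^2\frac{\partial b}{\partial v} + (\xi^3 p_1 - \xi^1 p_3)b\Big]\sin u$
$\quad + \Big[(p_2^2 + p_3^2)a^2 + \Big(\frac{\partial a}{\partial v}\Big)^2\Big]\cos^2 u + 2\Big[\Big(a\frac{\partial b}{\partial v} - b\frac{\partial a}{\partial v}\Big)p_3 - p_1p_2ab\Big]\cos u\sin u + \Big[(p_3^2 + p_1^2)b^2 + \Big(\frac{\partial b}{\partial v}\Big)^2\Big]\sin^2 u,$

$\frac{\partial(x^1, x^2)}{\partial(u, v)} = -\Big[a\xi^2\sin u + b\xi^1\cos u + b\frac{\partial a}{\partial v}\cos^2 u - p_3(b^2 - a^2)\cos u\sin u + a\frac{\partial b}{\partial v}\sin^2 u\Big],$

$(EG - F^2)^{1/2} D = -ab\,\frac{\partial x^3}{\partial v},$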
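The fundamental formulae lend themselves to a numerical check. The sketch below assumes, in the usual manner of Darboux, that the total velocity of a point is the sum of the velocity of the origin, the rotation term $p \times x$, and the relative velocity; the function names and the sample numbers are hypothetical. It compares that decomposition with the three velocity components written out for the point $(a\cos u, b\sin u, 0)$, and compares F computed as a scalar product with its closed form.

```python
import math

def cross(p, x):
    """Vector product p x x in three dimensions."""
    return (p[1] * x[2] - p[2] * x[1],
            p[2] * x[0] - p[0] * x[2],
            p[0] * x[1] - p[1] * x[0])

def velocity(u, a, b, da, db, xi, p):
    """Total velocity of (a cos u, b sin u, 0) referred to the moving axes.

    Assumed decomposition: translation xi + rotation p x x + relative
    velocity, where da, db stand for the v-derivatives of the semi-axes.
    """
    x = (a * math.cos(u), b * math.sin(u), 0.0)
    rel = (da * math.cos(u), db * math.sin(u), 0.0)
    rot = cross(p, x)
    return tuple(xi[i] + rot[i] + rel[i] for i in range(3))

def velocity_formulae(u, a, b, da, db, xi, p):
    """The three v-derivatives written out term by term."""
    c, s = math.cos(u), math.sin(u)
    return (xi[0] + da * c - p[2] * b * s,
            xi[1] + db * s + p[2] * a * c,
            xi[2] + p[0] * b * s - p[1] * a * c)

def tangent_u(u, a, b):
    """Derivative of the point with respect to u."""
    return (-a * math.sin(u), b * math.cos(u), 0.0)

def F_closed(u, a, b, da, db, xi, p):
    """Closed form of F; (b*db - a*da) is half the v-derivative of b^2 - a^2."""
    c, s = math.cos(u), math.sin(u)
    return b * xi[1] * c - a * xi[0] * s + (b * db - a * da) * c * s + p[2] * a * b

# Hypothetical sample data.
sample = dict(u=0.9, a=2.0, b=1.5, da=0.3, db=-0.2,
              xi=(0.4, -0.7, 1.1), p=(0.5, -0.3, 0.8))
direct = velocity(**sample)
written = velocity_formulae(**sample)
F_dot = sum(t * w for t, w in zip(tangent_u(sample["u"], sample["a"], sample["b"]), direct))
```

The agreement of `direct` with `written`, and of `F_dot` with `F_closed(**sample)`, is what one expects if the formulae have been transcribed consistently.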


A(x', x?) ax? da 
(10) (EG — = —— 


0b 
v) Ov ov ov 


+ p3(b? cos? u + a? sin? »}. 


If we have given two conics, they may have any one of the five following
relations: (a) they do not intersect, (b) they intersect once, (c) they intersect
twice, (d) they touch, and (e) they may be coplanar. Omitting the last case,
when we are considering a one-parameter family of conics in space, we have
to distinguish the cases where adjacent conics do not meet, or where they
meet once, or where they meet twice, or where they are tangent to one an-
other. Or to put the matter in more exact language, they may not all touch
any curve, or they may all touch a curve, or they may touch two curves (or
one curve twice), or they may all touch a curve and lie in the corresponding
osculating planes. Let us verify these statements analytically.

If the conics of a series touch one curve, it must be possible to make u
such a function of v that

$\frac{\partial x^i}{\partial v} + \lambda\frac{\partial x^i}{\partial u} = 0, \quad i = 1, 2, 3.$

This involves

$\frac{\partial x^3}{\partial v} = \frac{\partial(x^1, x^2)}{\partial(u, v)} = 0.$

If we replace the sine and cosine of u by $2t/(1+t^2)$, $(1-t^2)/(1+t^2)$, where t
is the tangent of half of the eccentric angle, we have the condition that the
resultant of a quadratic and a quartic polynomial in t should vanish, which is
a bit long to write out, but involves no theoretical difficulties. There is more
interest in the case where the conics touch two curves.
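The passage from the eccentric angle to the tangent of its half can be illustrated as follows. This is a minimal sketch with hypothetical function names: it returns the cosine and sine in terms of t, and shows why an expression linear in $\cos u$, $\sin u$ becomes a quadratic in t once the denominator $1 + t^2$ is cleared.

```python
import math

def half_angle(t):
    """Return (cos u, sin u) for t = tan(u/2)."""
    return (1 - t * t) / (1 + t * t), 2 * t / (1 + t * t)

def linear_to_quadratic(A, B, C):
    """Coefficients (q2, q1, q0) such that
    (1 + t^2) * (A + B cos u + C sin u) = q2 t^2 + q1 t + q0."""
    return A - B, 2 * C, A + B

# Hypothetical sample: a linear form in cos u and sin u.
u = 0.7
t = math.tan(u / 2)
A, B, C = 1.1, -0.4, 0.6
q2, q1, q0 = linear_to_quadratic(A, B, C)
direct = (1 + t * t) * (A + B * math.cos(u) + C * math.sin(u))
as_polynomial = q2 * t * t + q1 * t + q0
```

An expression of the second degree in $\cos u$, $\sin u$ is treated in the same way after multiplication by $(1 + t^2)^2$, which is why a quartic appears beside the quadratic.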

These curves lie on the developable surface generated by the plane of the
conic, the characteristic line being the intersection with

$\frac{\partial x^3}{\partial v} = 0.$

The points where $\partial(x^1, x^2)/\partial(u, v) = 0$ must include the two intersections of
the conic with $\partial x^3/\partial v = 0$ so that

(11) $\frac{\partial(x^1, x^2)}{\partial(u, v)} = \frac{\partial x^3}{\partial v}\,(\alpha\cos u + \beta\sin u + \gamma).$



Again look at the matter geometrically. When two conics intersect twice, 
they lie on a pencil of quadric surfaces, two of which are cones, and the ver- 
tices of these cones are harmonically separated by the planes of the conics. 
When the conics are infinitely near, one cone tends to be squashed between 
them. It appears then that if the conics of our series are twice tangent to a 
curve, the tangent planes to the surface generated at all points of a conic 
pass through a common point and envelop a cone. Now let us look at the mat-
ter analytically. The equation of the tangent plane is

$(X^1 - x^1)\frac{\partial(x^2, x^3)}{\partial(u, v)} + (X^2 - x^2)\frac{\partial(x^3, x^1)}{\partial(u, v)} + (X^3 - x^3)\frac{\partial(x^1, x^2)}{\partial(u, v)} = 0.$

The reader should not confuse X appearing here with that in (1). This be-
comes in the present instance, thanks to (11),

$X^1 b\cos u + X^2 a\sin u + X^3[\alpha\cos u + \beta\sin u + \gamma] - ab = 0.$


[X*(b cos u) + X%(a sin u) — ab] = +X 


It appears then that the point

$X^1 = -a\alpha/\gamma, \quad X^2 = -b\beta/\gamma, \quad X^3 = ab/\gamma$

is in the tangent plane at every point of the conic. Conversely, when these
tangent planes pass through such a point, we have an identity in v and
$\partial(x^1, x^2)/\partial(u, v)$ is divisible by $\partial x^3/\partial v$ so that the conics touch two curves or
are the limits of conics touching two curves.
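That the point just written lies in every tangent plane along the conic can be confirmed by direct substitution. The sketch below uses hypothetical function names and sample values; it evaluates the left side of the tangent plane equation, in the form $X^1 b\cos u + X^2 a\sin u + X^3(\alpha\cos u + \beta\sin u + \gamma) - ab$, at the proposed vertex for several values of u.

```python
import math

def tangent_plane_value(X, u, a, b, alpha, beta, gamma):
    """Left side of the tangent plane equation at the parameter value u."""
    c, s = math.cos(u), math.sin(u)
    return X[0] * b * c + X[1] * a * s + X[2] * (alpha * c + beta * s + gamma) - a * b

def cone_vertex(a, b, alpha, beta, gamma):
    """The point (-a alpha / gamma, -b beta / gamma, a b / gamma)."""
    return (-a * alpha / gamma, -b * beta / gamma, a * b / gamma)

# Hypothetical sample data; gamma must not vanish.
a, b, alpha, beta, gamma = 2.0, 1.5, 0.3, -0.8, 1.7
vertex = cone_vertex(a, b, alpha, beta, gamma)
values = [tangent_plane_value(vertex, u, a, b, alpha, beta, gamma)
          for u in (0.0, 0.5, 1.3, 2.9, 4.4)]
```

Every entry of `values` vanishes up to rounding, whatever the value of u, which is the analytic content of the statement that the tangent planes envelop a cone.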


THEOREM 1. If the conics of a series are tangent to two curves, the tangent 
planes to the surface generated at all points of a conic will envelop a quadric cone 
which touches the surface all along the conic. 


Now consider the dual. We have a one-parameter family of quadric cones. 
If adjacent cones tend to touch twice, that is to say, if the cones are inscribed 
in two developable surfaces, they will also intersect in a conic. 


THEOREM 2. If the quadric cones of a one-parameter series be inscribed in
two different developables, the characteristic curves of these cones will be the 
generators of these developables and a series of conics tangent to two curves, or 
the limit of such a series. 


The surfaces generated by these conics are the ones considered by Blutel 
(q. v.). 

There remains the case where adjacent conics tend to touch. This means
that $\partial(x^1, x^2)/\partial(u, v)$ is divisible by $\partial x^3/\partial v$ but the line

$\frac{\partial x^3}{\partial v} = 0$

is tangent to the conic. If P be the point of contact, its line of advance is
along the conic, and also along the characteristic line whose equation has just 
been written. Hence P must be the point of contact with the edge of regres- 
sion. The plane of the conic must then be the osculating plane for the curve 
generated by P. Hence we have a series of conics tangent to a curve each 
lying in the corresponding osculating plane. Here also there will be a quadric 
cone tangent at all points of the conic. 
We have assumed (11) here, that is,

$\frac{\partial(x^1, x^2)}{\partial(u, v)} = \frac{\partial x^3}{\partial v}\,(\alpha\cos u + \beta\sin u + \gamma).$

This identity will lead to the equations


da 0b 
— praa+ y= pibB + = —a—; 

dv dv 

(12) 
piba — prop = p3(a* — 6%), 
at? — yap, = — bf}, BE + ybpi = — 

We see geometrically that if two conics lie on the same quadric cone, the gen- 
erators of the cone establish a projective relation between them. When the 
conics are infinitely near, the generators give the directions of the curves 
conjugate to the conics in the surface generated. Analytically, the differential
equation for the curves conjugate to the conics $\delta v = 0$ is


Ddu + D'dv = 0, 
A(x', x7) ax? da 0b 
dv v) dv dv Ov 
+ p3(b? cos? u + a? sin? u)dv = 0, 


In the present case this is 


— abdu + [ cos + pea sin u)(a cos + B sin u + 


da 0b 
Ov Ov 


+ p3(b? cos? u + a? sin? oh] dv = 0. 


In view of (12) this becomes

$du + [L(v) + M(v)\cos u + N(v)\sin u]\,dv = 0.$

Let us now introduce the tangent of the half-angle, so that

$\cos u = \frac{1 - t^2}{1 + t^2}, \quad \sin u = \frac{2t}{1 + t^2}, \quad du = \frac{2\,dt}{1 + t^2},$

$\frac{dt}{dv} + A(v) + B(v)t + C(v)t^2 = 0.$

This is a Riccati equation, characterized by the fact that the cross ratio
of four solutions is constant. This gives
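The constancy of the cross ratio may be seen on a very simple example. The sketch below is only illustrative and rests on two assumptions of mine: the coefficients are taken constant, with L = 2 and M = N = 0, so that the equation in t is $dt/dv + 1 + t^2 = 0$ with the explicit solutions $t = \tan(c - v)$; and the cross ratio is taken in the form $(t_1 - t_3)(t_2 - t_4)/((t_1 - t_4)(t_2 - t_3))$. The function names are hypothetical.

```python
import math

def cross_ratio(t1, t2, t3, t4):
    """Cross ratio (t1 - t3)(t2 - t4) / ((t1 - t4)(t2 - t3))."""
    return ((t1 - t3) * (t2 - t4)) / ((t1 - t4) * (t2 - t3))

def riccati_coefficients(L, M, N):
    """(A, B, C) of dt/dv + A + B t + C t^2 = 0, obtained from
    du + (L + M cos u + N sin u) dv = 0 by putting t = tan(u/2)."""
    return (L + M) / 2, N, (L - M) / 2

def sample_solution(c, v):
    """Solution t = tan(c - v) of dt/dv + 1 + t^2 = 0."""
    return math.tan(c - v)

# Four solutions, fixed by four hypothetical constants of integration,
# compared at three values of v chosen to stay clear of the poles.
constants = (0.1, 0.4, 0.8, 1.2)
ratios = [cross_ratio(*(sample_solution(c, v) for c in constants))
          for v in (0.0, 0.2, 0.5)]
```

The three entries of `ratios` coincide up to rounding. With variable coefficients the solutions are no longer elementary, but the same invariance is what establishes the projective correspondence.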


BLUTEL'S THEOREM 3. If the central conics of a series be not coplanar, but
touch two curves, the conjugate curves on the surface they generate will establish
a projective correspondence among them(4).


Let us now look at the orthogonal trajectories of conics. Their differential
equation is

(13) $E\,du + F\,dv = 0.$

Introducing the tangent of the half-angle as before, we have

$2[b^2t^4 + (4a^2 - 2b^2)t^2 + b^2]\,dt + \Big[b\xi^2(1 - t^4) - 2a\xi^1 t(1 + t^2) + \frac{\partial(b^2 - a^2)}{\partial v}\,t(1 - t^2) + p_3ab(1 + t^2)^2\Big](1 + t^2)\,dv = 0.$

These trajectories will establish a projective correspondence if this is a Riccati
equation. If $b^2 = a^2$, the equation reduces automatically to the Riccati form.
Suppose that $b^2 \neq a^2$. Then $1 + t^2$ cannot divide the coefficient of dt and the
first factor in the coefficient of dv must be proportional to the coefficient of dt.
Evidently, the factor of proportionality must be zero, so that F = 0 or

$\xi^1 = 0, \quad \xi^2 = 0, \quad p_3 = 0, \quad b^2 - a^2 = k,$

where k is a constant, not zero.
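The behaviour of the coefficient of dt can be checked numerically. The sketch below, with hypothetical function names, assumes that after the substitution of the half-angle tangent the quantity $E(1 + t^2)^2$ equals the quartic $b^2t^4 + (4a^2 - 2b^2)t^2 + b^2$, and shows that this quartic is divisible by $1 + t^2$ when $b^2 = a^2$ but not otherwise.

```python
import math

def E_of_u(u, a, b):
    """E = a^2 sin^2 u + b^2 cos^2 u."""
    return a * a * math.sin(u) ** 2 + b * b * math.cos(u) ** 2

def dt_quartic(t, a, b):
    """Quartic b^2 t^4 + (4 a^2 - 2 b^2) t^2 + b^2, i.e. E (1 + t^2)^2."""
    return b * b * t ** 4 + (4 * a * a - 2 * b * b) * t * t + b * b

def value_at_t_squared_minus_one(a, b):
    """The quartic with t^2 replaced by -1; it vanishes exactly when
    1 + t^2 is a factor of the quartic."""
    return b * b * 1 + (4 * a * a - 2 * b * b) * (-1) + b * b

# Hypothetical sample values.
u = 1.1
t = math.tan(u / 2)
ellipse = (2.0, 1.5)
circle = (2.0, 2.0)
```

For the circle the quartic is $a^2(1 + t^2)^2$, so the factor $1 + t^2$ cancels throughout and the Riccati form results; for a proper ellipse the value at $t^2 = -1$ is $4(b^2 - a^2)$, which does not vanish.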


THEOREM 4. The necessary and sufficient condition that the trajectories or- 
thogonal to the central conics of a series should establish a projective correspond- 
ence among them is that the conics should be circles; or the centre should be fixed 
or move orthogonally to the plane, the distance between the foci should be constant 
and the axes should not twist. 


Let us now try to discover under what circumstances these orthogonal
trajectories are geodesic curves of the surface. The necessary and sufficient
condition for this is that

$\frac{\partial}{\partial u}\Big(\frac{EG - F^2}{E}\Big) = 0.$

(4) Blutel, loc. cit., p. 155.



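This shows that $F^2$, and so F, is divisible by E when the roots of E are
distinct, and as they are of the same order in t when we substitute the tangent
of the half-angle, the factor must be a function of v:

$F = f(v)E,$
$f(v)[a^2\sin^2 u + b^2\cos^2 u] = b\xi^2\cos u - a\xi^1\sin u + \frac{1}{2}\frac{\partial(b^2 - a^2)}{\partial v}\cos u\sin u + p_3ab.$

It follows from this identity that f(v) = 0, F = 0 or $\xi^1 = \xi^2 = p_3 = 0$, $b^2 - a^2 = c$,
where c is a constant, not zero.
Thus $\partial G/\partial u = 0$, or

$\xi^3p_1b\cos u + \xi^3p_2a\sin u + p_1p_2ab(\sin^2 u - \cos^2 u) + \Big[p_1^2b^2 - p_2^2a^2 + \Big(\frac{\partial b}{\partial v}\Big)^2 - \Big(\frac{\partial a}{\partial v}\Big)^2\Big]\cos u\sin u = 0,$

$\xi^3p_1 = \xi^3p_2 = p_1p_2 = p_1^2b^2 - p_2^2a^2 + \Big(\frac{\partial b}{\partial v}\Big)^2 - \Big(\frac{\partial a}{\partial v}\Big)^2 = 0.$

If $\xi^3 \neq 0$, then $p_1 = 0$, $p_2 = 0$, $\partial a/\partial v = \partial b/\partial v = 0$. We have a conic of fixed
axes generating a right cylinder.
If $\xi^3 = 0$, we have a fixed centre, and either

$p_1 = 0, \quad \Big(\frac{\partial b}{\partial v}\Big)^2 - \Big(\frac{\partial a}{\partial v}\Big)^2 = p_2^2a^2,$

or

$p_2 = 0, \quad \Big(\frac{\partial a}{\partial v}\Big)^2 - \Big(\frac{\partial b}{\partial v}\Big)^2 = p_1^2b^2.$

The distance between the foci is constant, the plane rotates about one axis
which has a fixed direction.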
There is the second case where

$F = [a\xi^2\cos u - a\xi^1\sin u + p_3a^2].$

We get from $\partial(EG - F^2)/\partial u = 0$ that $G - [\xi^2\cos u - \xi^1\sin u + p_3a]^2$ does not
depend on u. Hence

$\frac{\partial a}{\partial v}\xi^1 - \xi^3ap_2 = 0, \quad \frac{\partial a}{\partial v}\xi^2 + \xi^3ap_1 = 0,$

$a^2(p_1^2 - p_2^2) = (\xi^1)^2 - (\xi^2)^2, \quad a^2p_1p_2 = \xi^1\xi^2.$

One solution of these equations is $p_1 = p_2 = 0$, $\xi^1 = \xi^2 = 0$, and this gives rise
to the parallels of a surface of revolution.

A second solution is $\partial a/\partial v = 0$, $\xi^3 = 0$, $\xi^1 = \pm ap_1$, $\xi^2 = \pm ap_2$. It is readily
shown that the last three of these equations constitute necessary and suffi-
cient conditions that the centre of the moving circle lie on and move in the
direction of the characteristic line of the plane of the circle and have a velocity
which is $\pm a$ times the angular velocity with which this plane turns about the
characteristic line. The conditions also guarantee that the planes of the circles
are the osculating planes of the locus of the centres. Thus, the circles have
their centres in the points of a twisted curve of constant torsion 1/a, lie in
the osculating planes of this curve, and have the constant radius |a|(5).

All other solutions of the four equations are imaginary. 


THEOREM 5. If the central conics of a series are geodesically parallel, but are
not circles, either they are the right sections of a quadric cylinder or the centre and
direction of one axis is fixed, and the distance between the foci is constant. If the
conics are real circles, either they are the parallels of a surface of revolution, or
they have their centres in the points of a twisted curve of constant torsion 1/a,
lie in the osculating planes of this curve and have the constant radius |a|(5).


Let us next inquire under what circumstances the conics will be lines of
curvature. A plane curve will be a line of curvature if the normals to the sur-
face all along it make the same angle with the plane, or what comes to
the same thing, the normals to the curve making a certain constant angle
with the plane are normal to the surface. If C be the tangent of the angle
which a normal to the conic makes with the $x^3$ axis, the direction cosines of
this normal are proportional to

$b\cos u, \quad a\sin u, \quad -C(b^2\cos^2 u + a^2\sin^2 u)^{1/2}.$

This will be normal to the surface, that is to say, normal to $\partial x/\partial v$ if

$\Big[b\xi^1\cos u + a\xi^2\sin u + b\frac{\partial a}{\partial v}\cos^2 u + a\frac{\partial b}{\partial v}\sin^2 u - p_3(b^2 - a^2)\cos u\sin u\Big]^2$
$= C^2(b^2\cos^2 u + a^2\sin^2 u)(\xi^3 + p_1b\sin u - p_2a\cos u)^2.$

This is to be an identity in u. The left side is a perfect square, hence either
the right side is, or both vanish identically. Excluding this case, the right is a
perfect square if $b^2\cos^2 u + a^2\sin^2 u$ is a perfect square, and this involves a = b
so that we have a circle. The evolutes of a circle are points, a sphere will touch
the surface all along the circle, or the surface is the envelop of a one-parameter
family of spheres.

Suppose, next, that each side vanishes identically and that C = 0. Then

$\xi^1 = \xi^2 = \frac{\partial a}{\partial v} = \frac{\partial b}{\partial v} = p_3(b^2 - a^2) = 0.$

(5) This possibility was pointed out to me by Professor Graustein.

C = 0 gives the fact that the plane of the conic is orthogonal to the surface,
$\xi^1 = \xi^2 = 0$ the centre is fixed, or moves orthogonally to the plane, $\partial a/\partial v = \partial b/\partial v
= 0$ that the lengths of the axes are constant. If $p_3 = 0$, then $\partial x^1/\partial v = \partial x^2/\partial v = 0$
and every point moves orthogonally to the plane. If b = a, the surface is the
envelop of spheres of constant radius, and is therefore a canal surface.

If, on the other hand, $C \neq 0$, then $\xi^3 = p_1 = p_2 = 0$ and $\partial x^i/\partial v = 0$, so that
there is no surface, unless b = a, in which case we have the circles as before.


THEOREM 6. The necessary and sufficient condition that the central conics of a 
non-planar series should be lines of curvature is that either they be the charac- 
teristic circles on a one-parameter family of spheres, or that they be invariable 
in size and shape and invariant in their planes and so generate a surface of 
Monge. 


2. Congruences. Let us pass to two-parameter systems, or congruences.
Let us call the parameters $v_1$, $v_2$, putting subscripts 1 or 2 to the notations of
(3), (4), (6), (7), (8) to indicate the variable with regard to which the differ-
entiation has been performed. Let us look for the focal points, which we de-
scribed geometrically as the points where a conic meets an infinitely near
one. Analytically this means that when a certain relation has been estab-
lished between $v_1$ and $v_2$ we can make u such a function of these variables
that the tangent to the curve traced is the same as that to the conic. This
again will involve three relations

$\frac{\partial x^i}{\partial v_1}\,dv_1 + \frac{\partial x^i}{\partial v_2}\,dv_2 + \lambda\frac{\partial x^i}{\partial u} = 0, \quad i = 1, 2, 3.$

Setting the discriminant of these three linear homogeneous equations equal
to 0, we get a cubic expression in cos u, sin u which will have six roots.


THEOREM 7. The central conics of a congruence will usually have six focal 
points where they touch six surfaces or meet certain curves. 


This number is in accordance with a result of Darboux's where it is
shown(6) that where a congruence is composed of plane curves of order m
the number of focal points is m(m+1). When our central conics are circles,
two focal points are on the circle at infinity, we usually overlook them and
say that the circles of a congruence touch four surfaces.

Let us now inquire under what circumstances the conics of a congruence
are orthogonal to a surface. For this purpose, u must be such a function of $v_1$
and $v_2$ that

$E\frac{\partial u}{\partial v_1} + F_1 = E\frac{\partial u}{\partial v_2} + F_2 = 0.$

(6) Théorie Générale des Surfaces, vol. 2, p. 4.



(= OF ) 
F.{ — — 
ou 
Developing this at length, we get a rather fearsome equation 
“| + 5?) + 5?) 
2 Pos Pas Ove 
+b) aa +b) 
+ cos u 
2 2 Ove 
1 2 2 
O(a +5) 
+ ab (pati — ptt | 
1 


+ cos? u + a? sin? 9(abps,) 
Ov, 


( Ove | 


The condition of integrability will be 


(14) OF 2 —~) 
ave 


+ (tite — | 


+ a pats | 


1 1 
+ cos? usin (= =~) 
Ove 
2 2 
0b 0b: 
+ cos u sin (= 


Ove Ov; 
1 


1 
+ sin® ua? (= = 0 


Ov, Ove 


b a) 2 a) 2 
2 Ove 0; 


If we replace u by ¢ as before, this equation becomes sextic. 


THEOREM 8. If the central conics of a congruence be normal to more than six 
surfaces, the congruence is a normal one. 


The conditions for a normal congruence will, then, be that the left side
of this equation should vanish identically, for all values of u. Now if we have
an expression

$A_0 + A_1\cos u + B_1\sin u + A_2\cos^2 u + B_2\cos u\sin u$
$\quad + C_2\sin^2 u + A_3\cos^3 u + B_3\cos^2 u\sin u + C_3\cos u\sin^2 u + D_3\sin^3 u = 0,$

we shall find the conditions for vanishing identically by putting u succes-
sively equal to 0, $\pi/4$, $\pi/2$, $3\pi/4$, $\pi$, $5\pi/4$, $3\pi/2$, $7\pi/4$. This will give

$B_2 = A_0 + A_2 = A_0 + C_2 = A_1 + A_3 = B_1 + D_3 = A_1 + C_3 = B_1 + B_3 = 0.$
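The conditions for identical vanishing can be tested numerically. The sketch below assumes the seven relations in the form $B_2 = A_0 + A_2 = A_0 + C_2 = A_1 + A_3 = B_1 + D_3 = A_1 + C_3 = B_1 + B_3 = 0$, which is how the garbled line reads once $\cos^2 u + \sin^2 u = 1$ is used to make the expression homogeneous; the function names are hypothetical. It builds coefficients obeying the relations from three free constants and evaluates the cubic at the eight sample angles and at one further angle.

```python
import math

def trig_cubic(u, A0, A1, B1, A2, B2, C2, A3, B3, C3, D3):
    """Value of the cubic expression in cos u and sin u."""
    c, s = math.cos(u), math.sin(u)
    return (A0 + A1 * c + B1 * s + A2 * c * c + B2 * c * s + C2 * s * s
            + A3 * c ** 3 + B3 * c * c * s + C3 * c * s * s + D3 * s ** 3)

def vanishing_coefficients(A0, A1, B1):
    """Coefficients satisfying the seven relations, with A0, A1, B1 free."""
    return dict(A0=A0, A1=A1, B1=B1, A2=-A0, B2=0.0, C2=-A0,
                A3=-A1, B3=-B1, C3=-A1, D3=-B1)

# The eight sample angles, followed by one arbitrary extra angle.
angles = [k * math.pi / 4 for k in range(8)] + [0.37]

good = vanishing_coefficients(1.3, -0.6, 2.1)
good_values = [trig_cubic(u, **good) for u in angles]

# Breaking a single relation (here B2 = 0) destroys the identity.
bad = dict(good, B2=0.5)
bad_values = [trig_cubic(u, **bad) for u in angles]
```

All entries of `good_values` vanish up to rounding, while `bad_values` does not vanish at $u = \pi/4$, so the relation on $B_2$ is seen to be necessary as well.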




Assuming, then, that $b^2 \neq a^2$, we find
Ao = Ap = Be = C2 = = Bi + Bs = Bi +d; =0; 
b) 
O(01, V2) 
O(pa,ab) _ 0 
Ov2 
O(a? + 5?) + 5?) 


1 2 


a be a be 
dv, \(b? — a?)1/2 dv, \(b? — a?)1/2 


a? 2 0 \ a? 

b? 1 9 \ b? 

(b? — a?)1/2 (b? — 


The most important of these equations is (I) which gives 


(I) 


(II) 


1 
(IIT) ' E 


0 
(VI) (pati — pats) + £2 — log 
00; 


2 2. 1 te) 
(VII) (pat: — — & —— log 


THEOREM 9. The semi-axes of the central conics of a normal congruence are 
functionally related. 


Let us next assume that we are in the special case where

(16) $\xi^1_1\xi^2_2 - \xi^1_2\xi^2_1 = 0.$

This means geometrically that either we have a fixed centre, or the centre
traces a curve, or that at each centre the plane of the conic is orthogonal to
the tangent plane to the surface of centres. From (III) follows


THEOREM 10. If the central conics of a normal congruence have axes of fixed 
lengths, the centre will be fixed, or trace a curve, or a surface orthogonal at each 
point to the plane of the corresponding conic. 


Assuming that (16) still holds, but the lengths of both axes are not fixed, 
we write 


1 1 dy, 1 dve 
= 0, 
(17) 
edt = 
dt 




(18) 
psdt = + by at = 


Here t is an arbitrary variable, not $\tan(u/2)$. By (16) the first two have a com-
mon solution, and then, by (I), (II) and (III) the last two have a common
solution unless $a^2 + b^2$ = const. Suppose, first, that all four have a common
solution. We are then at liberty to assume


(19) da 0b 


We have also the additional equations 


1 2 
0 
(20) 


All seven of our equations (I)-(VII) are satisfied. Assuming that we have
a surface of centres, the curves $v_1$ = const. thereon correspond to constant
values for the lengths of the axes. Let us take $v_2$ = const. as the curves orthogo-
nal to them. The equations $\xi^1_2 = \xi^2_2 = 0$ tell us that the curves $v_1$ = const. are
orthogonal to the corresponding planes of the conics, hence the curves
$v_2$ = const. are tangent to the planes of the conics, or lie in them. But now we
find from the first two equations (7) that the instantaneous motion of every
point of the conic, when $v_1$ is constant, is orthogonal to the plane. The equa-
tions


a axe, a 
Ove Ov) Ove 


Since 0X o/dv; is in the plane of the conic 


aXe axi 
= + = | on + | 4 aj2, 


0; V1 


a’xi ( aXo 
= | an + | 
00,002 00; / Ov; / Ave 


But by (4) and ~3,=0 


v1 


Oa dry; Oa 
da = — —dt+— —d= 0, 
dt dt 
1 2 
0 0 
Ove Ove 
give by (4) 


a; =| =| 
Ove Ove 


001002, 001002, 
Hence = p(0X}/dv2) and 


axt aXe 
Ove 0 Ove 


aXe | an,| 
Ove Ove 


This means that the normals to the plane all along the curve $v_2$ = const.
are parallel and this curve must be in the plane, not merely tangent to it.
There are, hence, only a singly infinite number of planes, so that in each there
are an infinite number of conics. We have moreover the equations

1 2 
Ops 


Ove Ove 


From the equations (7) when $v = v_2$, everything is independent of $v_2$. Hence we
have the same series of conics in all our planes. The congruence is generated
by an immovable set of conics in a singly infinite set of planes. Conversely,
such a set of conics will clearly generate a normal congruence.

When the centre traces a curve, if 0 we may repeat our previous rea- 
soning, 0X}/dv2#0 and we have in each plane a one-parameter family of con- 
centric conics. If &=0, 0X}/dv.=0, then 


a ( axe ) 
ah, — ] = = 0, = 


As before 0X4/d0,:=dan+puan, &=0. This means that the curve traced by 
the centre is tangent to the plane of the conic. Our last expression can be 


written better 
axy ( ) ( axe ) 
=(a a a 
an, kA i an, 


Remembering that the left is independent of v2 and p;,=0, 
ax, aXe 
= (on 


Ov, Ov; 


0a; 
(on =) = ( = 0. 
Ove Ove 
Hence 
: 
k 
cD, 
Ove 


Differentiating to v2, remembering = ps, =0, 


2 
tex 


Differentiating to 1, 


The centre traces a plane curve through the fixed origin. As this can be any- 
where, the curve must be a straight line. Hence we have a series of conics 
whose centres lie on a line while the plane is rotated about that line. 

There remains the possibility that the centre of the conics should be fixed. 
Here we are back on the first case, we have a set of invariable conics whose 
planes envelop a cone with its vertex at their common centre. The reasoning 
is reversible in each case. We note that in every case we have a one-parameter 
family of conics immovable in a moving plane. 

I return to the equations (17) and (18) and assume that the first two still
have a common solution, and hence, that the last two have, when $a^2 + b^2$ is
not a constant, but that the solutions are different. Here we are free to as-
sume that


da 1 2 
From (IV) and (V), 


= —————- $22), = ————— 92(). 
a b 


From (II), ps, is a function of v, alone. From (V1), &/& is a function of v; alone. 
Hence ¥2/¢2 is a constant. We may change variables writing ¢2=v2, ~2 = kv2. 
Then, from (VI) and (VII) 


og 


00,002 001002 
Ove Ov} Ove 
Hence 
+CXi=0 
a? kb @ 
= — —— log = — — 
+) - 


Putting a/b=p, 

_ +1) ap 

kp? 

Again, eliminating 3, from (VI) and (VII) 


a? — log ————-_ — log —————__ = 
00; (b? — a?)1/2 (b? — a?)1/2 


This can be written 


da 0b log (b? — a*)1/2 
2a — + 2k*b? — = (a? + 
an dn 


$a^2 + k^2b^2 = C(b^2 - a^2),$
where C is a constant since a and b are not functions of $v_2$. Hence $\rho$ = const.,


ps3,=0. Hence from (VI) and (VII) we either have a and b constant which 
gives 0a/dv,;=0b/dv,=0 and throws us back on a previous case, or else 


ts = = ps, = 0. 


Here the centre is either fixed or traces a curve orthogonal to the plane of 
the conic. Again we are on a previous case. 

Next we assume $a^2 + b^2$ = const. Then from (III) the equations (17) have
a common solution. Assume that this is a solution of the first equation (18).


We may write 


1 2 Oa 
= =— = 0, 
bo = &e 


From (VI) and (VII) 
Puts = Puts = 0. 


If ps,=0, we have equations (19) and (20) and we can proceed as before. If 
ti=£=0, the centre is either fixed or traces a curve which cuts the plane 
orthogonally. There is but a singly infinite set of planes. We may take the 
parameter v, to give the plane. As 0a/0v.=0b/dv2=0, we have in each plane 
the same set of concentric conics. 

Suppose now that the solutions of (17) do not give a solution of (18). 

We may write 


&=h=0. 


We may assume a=cos 1, b=sin 1. From (V) and (IV) 
(tan? — 1)1/2 


tan 7 


te = di(vs)(tan? — 1)"/2, = 


da 0b 
Ove Ove 




Now if p3,=0, 0p3,/00: =0, and we are back on a previous case. If p:,0, we 
eliminate it between the equations (VI) and (VII) and find that $/# is a 
function of v, alone. Hence ¢2(v2) =k(¢:(v2)) where & is constant, and 


2 1 
a 2 
log (b? a*)1/2 + (=) 


— tan’ 7; 7; tan? 7; 
_ k? = 0, 


tan? 1 1 — ctn? 
tan? 9; + k? 
tan? 7; — 1 


’ 


so that tan 2; is constant and we do not have a two-parameter family. 


THEOREM 11. If in a normal congruence of central conics the locus of the 
centres is a surface which at each point is orthogonal to the plane of the corre- 
sponding conic, or is a curve, or is fixed, the congruence is generated by a series 
of conics which are immovable in a moving plane. 


There remains the case where the first two equations (17) are not propor- 
tional. We may assume here § =£ =0. 
Let us write 


de b? 


log log ——————_ = B 


(6? — (6? — 


From (IV) and (V) 
b? — g2)1/2 b? — g2)1/2 


We now change the variables 1; and v2 to such functions of them that 


Further, let 


From (II) and (III) 


(b? — 
- 
Ow, OWe 
an, 1, 2, 
$1(01) = ¥1(w), = we), 
_ — OA _ bh, OB 
0A 0B 
oa, =f. 
Ov, OW, 




(21) 
Ow, 


2 2 2 2 
+5) +o) 


Ow, OWe 


(22) 0. 


I confess, to my shame, that I have not been able to make much progress 
towards solving these equations, or discovering their geometrical significance. 
In spite of that I still think that the method here outlined is the most promis- 
ing for studying the problems indicated. 


HARVARD UNIVERSITY, 
CAMBRIDGE, Mass. 




ON CIRCAVARIANT MATRICES AND CIRCA- 
EQUIVALENT NETWORKS 


BY 
RICHARD STEVENS BURINGTON 


1. Introduction. In recent papers(¹) the author has considered various 
questions concerning the equivalence of quadrics in m-affine n-space and related 
problems in the theory of (absolutely) equivalent m-terminal pair electrical 
networks. 

The present paper is concerned with the development of certain theorems 
relating to the theory of congruent matrices which appear to be fundamental 
to the construction of a somewhat more general theory of (relative) equiva- 
lent electrical networks. 

Consider the set of matrices B congruent to the matrix A; i.e., $B=P'AP$. 
In the first section of this paper a theory of circavariant matrices is initiated, 
general theorems being obtained relating to the restrictions which must be 
imposed on P in order that one or more of a certain set $A_1, A_1^2, \cdots$ of matrices 
derived from A each be circavariant. In later sections theorems on the 
congruence of matrices with P in a modified m-affine space are obtained, together 
with a set of normal forms. 

In the last section, the theory of circavariant matrices is used to initiate 
a general theory of circa-equivalent networks, the usual theory of equivalent 
networks appearing as a special case of the general theory. 

2. Congruent and circavariant matrices. Let $A, B, C, \cdots, P, Q, \cdots$ 
be matrices of order n whose elements belong to a field §. The matrix B is 
said to be equivalent(²) to the matrix A if there exist nonsingular matrices P 
and Q such that $B=QAP$. The matrix B is said to be congruent to A if there 
exists a nonsingular matrix P such that $B=P'AP$. 

Let $C_{r_1\cdots r_t}^{s_1\cdots s_t}$ denote the matrix obtained from C by deleting from C rows 
$r_1,\cdots,r_t$ and columns $s_1,\cdots,s_t$. Denote $C_{r_1\cdots r_t}^{r_1\cdots r_t}$ by $C_{r_1\cdots r_t}$. 

Consider the set $\mathfrak{A}$ of all matrices A of order n whose elements range 


Presented to the Society, November 25, 1938, under the title On the congruence of matrices 
and associated circavariant matrices; received by the editors December 4, 1939. 

(¹) Burington, Richard S., On the equivalence of quadrics in m-affine n-space and its relation 
to the equivalence of 2m-pole networks, these Transactions, vol. 38 (1935), pp. 163-176; hereafter 
called paper [1]. 

Burington, Richard S., Matrices in electric circuit theory, Journal of Mathematics and Phys- 
ics, vol. 14 (1935), pp. 235-249; hereafter called paper [2]. 

Burington, Richard S., R-matrices and equivalent networks 1, Journal of Mathematics and 
Physics, vol. 16 (1937), pp. 85-103; hereafter called paper [3]. 

(²) Throughout this paper it is understood, unless otherwise stated, that all the elements 
of all matrices used belong to a commutative field § whose characteristic is not two. 




378 R. S. BURINGTON [November 


over §. With each A associate the set $\mathfrak{B}$ of all matrices congruent to A. If B 
is any matrix of the set $\mathfrak{B}$, there exists a nonsingular matrix P such that 


(2.1) B = P’AP. 


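If T ranges over the set $\mathfrak{P}$ of all nonsingular matrices of order n, then the 
matrices $T'AT$ are all congruent to A, and hence each of them belongs to $\mathfrak{B}$. The 
matrix P in (2.1) belongs to $\mathfrak{P}$. If (2.1) holds and if there exists a subset $\mathfrak{P}_c$ 
of $\mathfrak{P}$ such that for all matrices A of $\mathfrak{A}$, and for all matrices P of $\mathfrak{P}_c$, 

(2.2) $B_{r_1\cdots r_t}^{s_1\cdots s_t} = P_{r_1\cdots r_t}'\,A_{r_1\cdots r_t}^{s_1\cdots s_t}\,P_{s_1\cdots s_t},$ 

then $A_{r_1\cdots r_t}^{s_1\cdots s_t}$ is called a circavariant matrix of A under the congruence (2.1). 
Let $\mathfrak{B}_c$ denote the subset of $\mathfrak{B}$ obtained by letting P range over $\mathfrak{P}_c$. 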

Thus, if (2.1) holds and if $A_1$ is a circavariant matrix, then $B_1$ can be obtained 
directly from the product $B_1=P_1'A_1P_1$, or from B by deleting the first 
row and column. 

The term circavariant matrix has been introduced here to avoid confusion 
with the term invariant matrix as used by I. Schur, Littlewood and other 
writers, as in D. E. Littlewood, The construction of invariant matrices, Proceedings 
of the London Mathematical Society, (2), vol. 43 (1937), pp. 226-240. 
In paper [1], a circavariant matrix was called an invariant matrix. In contrast 
to the definition used in [1], the present definition places greater emphasis 
on the requirement that (2.2) hold for all matrices of $\mathfrak{A}$. While in [1] 
P was restricted to (simply) m-affine types, here P is not so constrained. 

3. Conditions that $A_{r_1\cdots r_t}^{s_1\cdots s_t}$ be circavariant. In paper [1] a system of integer, 
matric and algebraic invariants of the matrix A of the n-ary quadratic 
form F was exhibited, under the simply m-affine nonsingular group of linear 
transformations T, by means of which necessary and sufficient conditions for 
the simply m-affine congruence with respect to T of two matrices A and B, 
as well as the equivalence of the two corresponding forms F and G, were given. 

Whereas in paper [1] we were concerned with the nature of the matrix A 
for given simply m-affine matrices P, in the present paper we are concerned 
with the content of the subset $\mathfrak{P}_c$, that is, with the conditions which must be 
imposed on P in order that $A_{r_1\cdots r_t}^{s_1\cdots s_t}$ be a circavariant matrix of A for the class 
$\mathfrak{A}$ under (2.1). We shall see that the solution to this question leads to a more 
general type of matrix P than that used in paper [1]. 

To begin with, we search for conditions on P in order that $A_1$ be a circavariant 
matrix. In other words, with reference to congruence (2.1), under 
what conditions is the matrix $M=P_1'A_1P_1$ identically equal to $B_1$ in the elements 
of A? 

For convenience, we number the first row (and column) of $A_1$ ($P_1$, M 
and $B_1$) as 2, the second as 3, $\cdots$, the (n-1)-th as n. The rows (and columns) 
of A (P and B) are numbered in the usual way, the first row as 1, the 
second row as 2, $\cdots$, the nth row as n. Evidently, 


1940] CIRCAVARIANT MATRICES AND NETWORKS 


(3.1) $M = \Bigl(\sum_{r=2}^{n}\sum_{s=2}^{n} p_{rj}a_{rs}p_{sk}\Bigr) = (m_{jk}),$ 

where the element in the jth row and kth column of M is $m_{jk}$, with rows and 
columns numbered (as agreed above) $j, k=2,\cdots,n$. Also from (2.1) 

(3.2) $B = \Bigl(\sum_{r=1}^{n}\sum_{s=1}^{n} p_{rj}a_{rs}p_{sk}\Bigr) = (b_{jk}),$ 

where $b_{jk}$ is the element in the jth row and kth column, $(j, k=1, 2,\cdots, n)$. 
In order that $B_1$ be identical to M in the elements of A, we must have 

(3.3) $b_{jk} = \sum_{r=1}^{n}\sum_{s=1}^{n} p_{rj}a_{rs}p_{sk} = \sum_{r=2}^{n}\sum_{s=2}^{n} p_{rj}a_{rs}p_{sk} = m_{jk},$ 

in the elements of A, for j and k any fixed pair of integers selected from 
$2,\cdots,n$. From (3.3) with j and k fixed, $(j, k=2,3,\cdots,n)$, we find that the 
following identities in the elements of A must hold, 


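(3.4) $p_{1j}a_{1s}p_{sk} = 0 \quad (s=1,\cdots,n), \qquad p_{rj}a_{r1}p_{1k} = 0 \quad (r=1,\cdots,n).$ 

The cases $j=k$ with $s=1$ give $p_{1j}a_{11}p_{1j}=0$ in $a_{11}$, $(j=2, 3,\cdots, n)$, hence $p_{1j}$ 
must vanish for $j=2, 3, \cdots, n$. With $p_{1j}=0$, $(j=2, \cdots, n)$, all the identities 
(3.4) are satisfied and $A_1$ is a circavariant matrix. Since P is nonsingular, $p_{11}\neq 0$. 
Hence 


THEOREM 3.1. A necessary and sufficient condition that $A_1$ be a circavariant 
matrix is that $p_{1j}=0$ for $j=2,\cdots,n$. 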

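A matrix of the form 

$$S = \begin{pmatrix}
p_{11} & 0 & \cdots & 0 & 0 & \cdots & 0 \\
0 & p_{22} & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & p_{mm} & 0 & \cdots & 0 \\
p_{m+1,1} & p_{m+1,2} & \cdots & p_{m+1,m} & p_{m+1,m+1} & \cdots & p_{m+1,n} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
p_{n1} & p_{n2} & \cdots & p_{nm} & p_{n,m+1} & \cdots & p_{nn}
\end{pmatrix}$$

with elements in § is said to be m-affine(³). If $p_{11}=p_{22}=\cdots=p_{mm}=1$, S is 
said to be simply m-affine. 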


(³) In paper [1], the term m-affine means simply m-affine. 






The matrix B is said to be m-affine congruent to A if there exists an m-affine 
nonsingular matrix S with elements in § such that B=S’AS. 

Theorem 3.1 states that a necessary and sufficient condition that $A_1$ be circavariant 
is that P be 1-affine. If B is 1-affine congruent to A, then $A_1$ is circavariant 
and $B_1$ is congruent to $A_1$. 
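
The statement of Theorem 3.1 can be spot-checked numerically. The sketch below is only an illustration, not part of the paper: the helper names and the test matrices are assumptions chosen here, and a check on one particular A does not replace the identity "in the elements of A" that the theorem asserts. It compares $B_1$ with $P_1'A_1P_1$ for a 1-affine P and for a P with $p_{12}\neq 0$.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def delete(m, idx):
    """Drop the 0-based rows and columns listed in idx."""
    return [[v for j, v in enumerate(row) if j not in idx]
            for i, row in enumerate(m) if i not in idx]

def congruent(p, a):
    """Return P'AP."""
    return matmul(matmul(transpose(p), a), p)

def a1_relation_holds(p, a):
    """True when B_1 equals P_1' A_1 P_1 for this particular pair (P, A)."""
    return delete(congruent(p, a), {0}) == congruent(delete(p, {0}), delete(a, {0}))

# Hypothetical test data: a symmetric A and two nonsingular choices of P.
A = [[2, 1, 0], [1, 3, 4], [0, 4, 5]]
P_affine = [[3, 0, 0], [1, 2, 1], [4, 0, 1]]    # first row (p11, 0, 0): 1-affine
P_general = [[3, 1, 0], [1, 2, 1], [4, 0, 1]]   # p12 != 0: not 1-affine

print(a1_relation_holds(P_affine, A))
print(a1_relation_holds(P_general, A))
```

For the 1-affine matrix the relation holds for every A; for the second matrix it fails already on this single A.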

More generally, suppose that we require that $A_u$ be circavariant. Then for 
j and k any fixed pair of integers selected from $1, 2,\cdots,u-1,u+1,\cdots,n$ 
the following identity in the elements of A must hold: 

(3.6) $\sum_{r=1}^{n}\sum_{s=1}^{n} p_{rj}a_{rs}p_{sk} = \sum_{r\neq u}\sum_{s\neq u} p_{rj}a_{rs}p_{sk}.$ 

This means that the following identities in the elements of A must hold: 

(3.7) $p_{uj}a_{us}p_{sk} = 0 \quad (s=1,2,\cdots,n), \qquad p_{rj}a_{ru}p_{uk} = 0 \quad (r=1,2,\cdots,n).$ 

The cases $j=k\neq u$ with $s=u$ give $p_{uj}a_{uu}p_{uj}=0$, so that each $p_{uj}=0$. We 
conclude 


THEOREM 3.2. A necessary and sufficient condition that $A_u$ be circavariant 
is that $p_{uj}=0$ for $j=1,\cdots,n$; $j\neq u$. 


We note that $d(P)=p_{uu}\cdot d(P_u)$. Since P is nonsingular $p_{uu}\neq 0$, $d(P_u)\neq 0$. 
From (2.2), we conclude that $B_u$ is congruent to $A_u$. Hence if $A_u$ is circavariant, 
$B_u$ is congruent to $A_u$. 

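Suppose we require $A_u^v$, $u\neq v$, to be circavariant. Then for j and k any 
fixed pair of integers selected from $j=1,\cdots,u-1,u+1,\cdots,n$ and 
$k=1,\cdots,v-1,v+1,\cdots,n$ we must have 

$$\sum_{r=1}^{n}\sum_{s=1}^{n} p_{rj}a_{rs}p_{sk} = \sum_{r\neq u}\sum_{s\neq v} p_{rj}a_{rs}p_{sk}$$

identically in the elements of A; i.e., 

(3.8) $p_{uj}a_{us}p_{sk} = 0 \quad (s=1,\cdots,n), \qquad p_{rj}a_{rv}p_{vk} = 0 \quad (r=1,\cdots,n).$ 

The cases $j=k=w$, $w\neq u$, $w\neq v$, with $s=u$ and $r=v$ give $p_{uw}a_{uu}p_{uw}=0$ and 
$p_{vw}a_{vv}p_{vw}=0$, so that $p_{uw}=p_{vw}=0$. The case $j=v$, $k=u$ gives 

(3.9) $p_{uv}a_{us}p_{su} = 0 \quad (s=1,\cdots,n), \qquad p_{rv}a_{rv}p_{vu} = 0 \quad (r=1,\cdots,n).$ 

Since P is nonsingular at least one $p_{su}\neq 0$. Hence $p_{uv}=0$. Likewise, at 
least one $p_{rv}\neq 0$, so that $p_{vu}=0$. The case $A_v^u$ leads to the same result. We have 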




THEOREM 3.3. A necessary and sufficient condition that $A_u^v$, $u\neq v$, be a circavariant 
matrix is that 

$p_{u\alpha} = 0 \quad (\alpha=1,\cdots,n,\ \alpha\neq u), \qquad p_{v\beta} = 0 \quad (\beta=1,\cdots,n,\ \beta\neq v).$ 

This theorem shows that a necessary and sufficient condition that $A_1^2$ be 
circavariant is that P be 2-affine. 

Evidently $d(P)=p_{uu}p_{vv}\cdot d(P_{uv})$. Since P is nonsingular, $p_{uu}p_{vv}\neq 0$ and 
$d(P_{uv})\neq 0$. Hence from (2.2), $B_u^v$ is equivalent to $A_u^v$. Thus, $B_u^v$ is equivalent to $A_u^v$ 
when $A_u^v$ is circavariant. 


If $A_{uv}$, $u\neq v$, is a circavariant matrix, then for j and k any fixed numbers 
selected from $w=1,\cdots,n$, $w\neq u$, $w\neq v$, we must have 

$$\sum_{r=1}^{n}\sum_{s=1}^{n} p_{rj}a_{rs}p_{sk} = \sum_{r\neq u,v}\ \sum_{s\neq u,v} p_{rj}a_{rs}p_{sk}$$

identically in the elements of A. This means that 

(3.10) $p_{uj}a_{us}p_{sk} = 0,\quad p_{vj}a_{vs}p_{sk} = 0 \quad (s=1,2,\cdots,n),$ 
$p_{rj}a_{ru}p_{uk} = 0,\quad p_{rj}a_{rv}p_{vk} = 0 \quad (r=1,2,\cdots,n)$ 

identically in the elements of A. The cases $j=k=w=1,\cdots,n$, $w\neq u$, $w\neq v$, 
with $s=u, v$ give $p_{uw}a_{uu}p_{uw}=0$, $p_{vw}a_{vv}p_{vw}=0$. We conclude that $p_{uw}=p_{vw}=0$. 
Since P is nonsingular, $p_{uu}p_{vv}-p_{uv}p_{vu}\neq 0$, and $d(P_{uv})\neq 0$. 


THEOREM 3.4. A necessary and sufficient condition that $A_{uv}$ be a circavariant 
matrix is that $p_{uw}=p_{vw}=0$, for $w=1,\cdots,n$, $w\neq u$, $w\neq v$. 


This theorem shows that a sufficient (but not necessary) condition that 
$A_{12}$ be circavariant is that P be 2-affine. 
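
The remark that 2-affinity is sufficient but not necessary for $A_{12}$ can be illustrated with a small numerical sketch. The matrices and helper names below are assumptions made for the illustration only. Here P satisfies $p_{1w}=p_{2w}=0$ for $w=3,4$ but has $p_{12}\neq 0$ and $p_{21}\neq 0$, so it is not 2-affine; the relation for $A_{12}$ still holds, while the one for $A_1$ fails on this A.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def delete(m, idx):
    """Drop the 0-based rows and columns listed in idx."""
    return [[v for j, v in enumerate(row) if j not in idx]
            for i, row in enumerate(m) if i not in idx]

def relation_holds(p, a, idx):
    """Check that deleting idx from B = P'AP gives P_idx' A_idx P_idx."""
    b = matmul(matmul(transpose(p), a), p)
    p_d, a_d = delete(p, idx), delete(a, idx)
    return delete(b, idx) == matmul(matmul(transpose(p_d), a_d), p_d)

# Hypothetical data: symmetric A; P has rows 1 and 2 vanishing in columns 3, 4.
A = [[1, 2, 0, 1], [2, 3, 1, 0], [0, 1, 4, 2], [1, 0, 2, 5]]
P = [[1, 2, 0, 0], [3, 1, 0, 0], [1, 0, 1, 2], [0, 1, 1, 3]]

print(relation_holds(P, A, {0, 1}))  # relation for A_12
print(relation_holds(P, A, {0}))     # relation for A_1
```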

Let I be the set $1,\cdots,n$ and let $u_1, u_2,\cdots,u_q$ be any subset U of I, 
all the elements of U being distinct. Denote by $W=I-U$ the set I with the 
elements of U removed. If r is not in W, we write $r\notin W$. 

If $A_{u_1\cdots u_q}$ is a circavariant matrix, then for j and k any fixed numbers 
selected from W, we must have 

$$\sum_{r=1}^{n}\sum_{s=1}^{n} p_{rj}a_{rs}p_{sk} = \sum_{r\in W}\sum_{s\in W} p_{rj}a_{rs}p_{sk}$$

identically in the elements of A. This means that 

(3.11) $p_{u_1j}a_{u_1s}p_{sk} = 0,\ \cdots,\ p_{u_qj}a_{u_qs}p_{sk} = 0 \quad (s=1,2,\cdots,n),$ 
$p_{rj}a_{ru_1}p_{u_1k} = 0,\ \cdots,\ p_{rj}a_{ru_q}p_{u_qk} = 0 \quad (r=1,2,\cdots,n),$ 

identically in the elements of A. The cases where $j=k$ and j ranges over W 
with $s=u_1, u_2,\cdots,u_q$ give 

$p_{u_1j}a_{u_1u_1}p_{u_1j} = 0,\ \cdots,\ p_{u_qj}a_{u_qu_q}p_{u_qj} = 0,$ 

from which we conclude that $p_{u_1j}=\cdots=p_{u_qj}=0$, for j ranging over W. 


THEOREM 3.5. A necessary and sufficient condition that $A_{u_1\cdots u_q}$ be circavariant 
is that $p_{u_1j}=p_{u_2j}=\cdots=p_{u_qj}=0$ for j ranging over W. 


From P delete all the rows and columns whose numbers belong to the set 
W, leaving the matrix $P_W$. It is easy to see that $d(P)=d(P_W)\cdot d(P_{u_1\cdots u_q})$. 
Since $d(P)\neq 0$, $P_W$ and $P_{u_1\cdots u_q}$ are both nonsingular, so that $B_{u_1\cdots u_q}$ is congruent 
to $A_{u_1\cdots u_q}$ when the latter is circavariant. 

Let $(u_1, u_2,\cdots,u_q)$ and $(v_1, v_2,\cdots,v_q)$ be two subsets U and V, respectively, 
of the set I of integers $1, 2,\cdots,n$. Suppose that all the elements of U 
and V are distinct. Let $L=I-U-V$ be the set I with the elements of U and V 
removed. Then 


THEOREM 3.6. A necessary and sufficient condition that $A_{u_1\cdots u_q}^{v_1\cdots v_q}$ (or $A_{v_1\cdots v_q}^{u_1\cdots u_q}$) 
be circavariant is that P be such that for each u in U, each v in V, and for each r 
in L, $p_{ur}=p_{vr}=p_{uv}=p_{vu}=0$. 


The proof of this theorem may be obtained from the proof of Theorem 3.3 
as follows: equations (3.8) must hold with u ranging over U and v ranging 
over V. The cases $j=k=\lambda$ ($\lambda$ in L) with $s=u$, $r=v$, u ranging over U and v over V, 
lead to the conditions $p_{u\lambda}=p_{v\lambda}=0$. The cases $j=v$, $k=u$, with u over U and v over 
V yield $p_{uv}=0$ and $p_{vu}=0$. 

It is easy to see that BY’, is equivalent to Au. “t, in case the latter is circa- 
variant. For example, if > rm circavariant, P is such that 


d(P) =d(E)-d(F)-d(P%2) 


where 
par pas Pas 
Since P is nonsingular so are E, F, and P%. From (2.2), BY is equivalent 
to Aj. 

In a similar manner further theorems concerning the circavariance of 
$A_{u_1\cdots u_q}^{v_1\cdots v_q}$ for the case when the sets U and V overlap can be stated together 
with theorems concerning the equivalence of $B_{u_1\cdots u_q}^{v_1\cdots v_q}$ and $A_{u_1\cdots u_q}^{v_1\cdots v_q}$. 

4. Invariants. From (2.1), $d(B)=[d(A)][d(P)]^2$. Hence, the determinant 
of A is a relative invariant of the set $\mathfrak{B}$ under (2.1) with P ranging over the 
set $\mathfrak{P}$. Since each P in $\mathfrak{P}$ is nonsingular, the rank of B is equal to the rank of A, 
so that the rank of A is an integer invariant for the set $\mathfrak{B}$. If the field § is 
ordered, the signatures (when defined) of B and A are equal, so that for ordered 
fields, the signature of A is an integer invariant for the set $\mathfrak{B}$. 
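
The determinant relation can be checked on a small case. This is a minimal sketch with hypothetical integer matrices; the cofactor-expansion determinant is chosen for brevity and is only suitable for small orders.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

# Hypothetical data: symmetric A and a nonsingular P with no special structure.
A = [[2, 1, 0], [1, 3, 4], [0, 4, 5]]
P = [[3, 1, 0], [1, 2, 1], [4, 0, 1]]
B = matmul(matmul(transpose(P), A), P)

# d(B) and d(A) differ by the square of d(P), so they share the same sign.
print(det(B) == det(A) * det(P) ** 2)
```

Because the factor is a square, the sign of the determinant is preserved over an ordered field, which is consistent with the invariance of the signature noted above.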

Suppose in (2.2), $P_{r_1\cdots r_t}$ and $P_{s_1\cdots s_t}$ are nonsingular and that $A_{r_1\cdots r_t}^{s_1\cdots s_t}$ is 
a circavariant matrix. Consider any function $G(a_{ij})=G$ of the elements of 
$A_{r_1\cdots r_t}^{s_1\cdots s_t}$ which is so related to the same function $G(b_{ij})=\bar G$ of the elements of 
$B_{r_1\cdots r_t}^{s_1\cdots s_t}$ that, in the elements $a_{ij}$, 

(4.1) $\bar G = \alpha G\beta \quad (\alpha\beta\neq 0),$ 

where $\alpha=\alpha(P_{r_1\cdots r_t})$ is a function of the elements of $P_{r_1\cdots r_t}$ only, and where 
$\beta=\beta(P_{s_1\cdots s_t})$ is a function of the elements of $P_{s_1\cdots s_t}$ only. Then G is said to 
be a circavariant of the set $\mathfrak{B}$ with respect to $A_{r_1\cdots r_t}^{s_1\cdots s_t}$. 

If G, Gi, Gi, Ge, - + + , Gi: are circavariants of the set 8 with respect to 
the circavariant matrices A, Aj, As, ---, respectively, then any 
function H of the form 


where the p,’s are real numbers is said to be a composite circavariant of 
the set B. Let H(G)=H denote H with G, Gi,---, Gi: replaced by 
G, Gi, Gi, «++, Gi, respectively. Then H is of the form 


(4.3) $\bar H = \gamma H\delta \quad (\gamma\delta\neq 0),$ 
where 


If $\gamma\delta=1$, then H is said to be an absolute circavariant of $\mathfrak{B}$. If $\gamma=\delta$, then H 
is said to be a relative circavariant of $\mathfrak{B}$. 

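Consider the set $\mathfrak{B}_{r_1\cdots r_t}^{s_1\cdots s_t}$ of all matrices $B_{r_1\cdots r_t}^{s_1\cdots s_t}$ generated from the circavariant 
matrix $A_{r_1\cdots r_t}^{s_1\cdots s_t}$ by letting P range over $\mathfrak{P}_c$, with $P_{r_1\cdots r_t}$ and $P_{s_1\cdots s_t}$ 
nonsingular. Then the rank $\rho_{r_1\cdots r_t}^{s_1\cdots s_t}$ of each matrix in $\mathfrak{B}_{r_1\cdots r_t}^{s_1\cdots s_t}$ is equal to the 
rank of $A_{r_1\cdots r_t}^{s_1\cdots s_t}$. 

We suppose that $(r_1,\cdots,r_t)=(s_1,\cdots,s_t)$. Then (2.2) is an ordinary 
congruence. Suppose the field § is ordered and that a P exists in $\mathfrak{P}_c$ for which 
$B_{r_1\cdots r_t}$ is a diagonal matrix, so that $A_{r_1\cdots r_t}$ has a signature $\sigma_{r_1\cdots r_t}$. From (2.2) 
it follows that the signature of each matrix in the set $\mathfrak{B}_{r_1\cdots r_t}$ is equal to 
$\sigma_{r_1\cdots r_t}$. Thus, 


THEOREM 4.1. The rank of $A_{r_1\cdots r_t}^{s_1\cdots s_t}$ is an integer invariant for the set $\mathfrak{B}_{r_1\cdots r_t}^{s_1\cdots s_t}$. 
If § is ordered, the signature (when defined) of $A_{r_1\cdots r_t}$ is an integer invariant 
for the set $\mathfrak{B}_{r_1\cdots r_t}$. 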


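If $A_{r_1\cdots r_t}^{s_1\cdots s_t}$ is a circavariant matrix, then from (2.2) 

(4.4) $d(B_{r_1\cdots r_t}^{s_1\cdots s_t}) = d(P_{r_1\cdots r_t})\cdot d(A_{r_1\cdots r_t}^{s_1\cdots s_t})\cdot d(P_{s_1\cdots s_t}).$ 

From (4.4) it is clear that $d(A_{r_1\cdots r_t}^{s_1\cdots s_t})$ is a circavariant for the set $\mathfrak{B}$. If 
$(r_1,\cdots,r_t)=(s_1,\cdots,s_t)$, then $d(P_{r_1\cdots r_t})=d(P_{s_1\cdots s_t})$, so that 


THEOREM 4.2. The determinants of the matrices of the set $\mathfrak{B}_{r_1\cdots r_t}^{s_1\cdots s_t}$ are circavariants 
for the set $\mathfrak{B}$ with respect to $A_{r_1\cdots r_t}^{s_1\cdots s_t}$. If $(r_1,\cdots,r_t)=(s_1,\cdots,s_t)$, these 
determinants are relative circavariants of $\mathfrak{B}$. 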


It may be remarked that in case $(r_1,\cdots,r_t)=(s_1,\cdots,s_t)$ these determinants 
are actually ordinary relative invariants of $\mathfrak{B}_{r_1\cdots r_t}^{s_1\cdots s_t}$ under an ordinary 
congruence of transformation matrix $P_{r_1\cdots r_t}$. 

Evidently the ratio of any two circavariants is a composite circavariant 
for the set B. 

5. Normal forms for A under P m-affine. In the theory of electrical networks 
the cases when $A_1, A_1^2, A_2, \cdots$ are to be circavariant frequently occur, 
leading to the requirement that P be m-affine. We shall accordingly consider 
the normal forms of A under P m-affine. 

In paper [1] the reduction of A to normal forms was indicated, the case 
where m=2 being considered in detail. In particular, the results obtained 
indicate that, when P is simply m-affine, every symmetric matrix A with elements 
in a field § (not of characteristic two) with circavariant matrix 
$A_{1,\cdots,m-1}$ is m-affine congruent in § to a matrix B in which the matrix 
$B_{1,\cdots,m-1}$ is of the form 


0 


and with a parabolic matrix 
(Oo O 

0 

0 
.-0 


the number of nonzero ),’s in By...» being equal to the rank p1...m Of Ay...m. 


5.1 , if = P1...m—-1 — Pi---m 2, 
( ) 0 0 v P1 1 

0 - O ‘ 0 

0 0 b,, 




The parameter $b_{mm}$ is an absolute invariant of A when P is simply m-affine. 

We shall let $\sigma_{1\cdots m}$ denote the signature of $A_{1\cdots m}$. If the field § is 
real, each positive $b_i$ in $B_{1\cdots m}$ can be reduced (by means of a simply m-affine 
P) to 1, and each negative $b_i$ to $-1$. The number of positive $b_i$'s in $B_{1\cdots m}$ 
is $(\rho_{1\cdots m}+\sigma_{1\cdots m})/2$ and the number of negative $b_i$'s is $(\rho_{1\cdots m}-\sigma_{1\cdots m})/2$, 
the remaining $b_i$'s each being zero. If § is algebraically closed, each nonzero 
$b_i$ in $B_{1\cdots m}$ can be reduced to 1. No further reduction of $B_{1\cdots m-1}$ is possible 
when P is simply m-affine. For (5.1) and (5.2), we shall denote the reduced 
form of $B_{1\cdots m}$ thus obtained by $\delta=[1,\cdots,1,-1,\cdots,-1,0,\cdots,0]$, a diagonal 
matrix. 

In case § is ordered, we shall agree to regularly arrange(⁴) the matrix $\delta$, 
this always being possible when P is m-affine. 

Suppose in (5.1), $b_{mm}\neq 0$. Let P be m-affine with $p_{rs}=\delta_{rs}$, (where $\delta_{rs}=0$ 
if $r\neq s$, $\delta_{rs}=1$ if $r=s$), except for $p_{mm}$. If § is algebraically closed, select 
$p_{mm}$ so that $p_{mm}^2=1/b_{mm}$. If § is real, let $p_{mm}=1/(b_{mm})^{1/2}$ if $b_{mm}>0$ and 
$p_{mm}=1/(-b_{mm})^{1/2}$ if $b_{mm}<0$. Then, in case of (5.1) with $b_{mm}\neq 0$, the matrix 
B is m-affine congruent to a matrix $C=P'BP$ in which the matrix $C_{1,\cdots,m-1}$ 
is of the form 


0 0 


0 


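where $\bar b_{mm}$ is 1 if § is algebraically closed; and $\bar b_{mm}=1$ or $-1$ if § is real. In the 
latter case, $\bar b_{mm}=1$ when $\sigma_{1\cdots m-1}=1+\sigma_{1\cdots m}$, $\bar b_{mm}=-1$ when $\sigma_{1\cdots m-1}
=-1+\sigma_{1\cdots m}$, and $\bar b_{mm}=0$ when $\sigma_{1\cdots m-1}=\sigma_{1\cdots m}$. 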

As in paper [1], it is now easy to formulate various theorems. For ex- 
ample, Theorem 3.3 of [1] for P m-affine holds without the requirement on 
the parameters bmm and Dj. 


THEOREM 5.1. Let P be m-affine with elements in an algebraically closed 
field §, and let A and A® be two symmetric matrices of order n in § with 


(⁴) C. C. MacDuffee, Theory of Matrices, Berlin, 1933, pp. 57-58. 


0 
associated circavariant matrices A{ ,-, and A? A necessary and suffi- 
cient condition that A{ .. »-, and Aj. »-, be congruent is that they have the 




same ranks respectively. If the field § 
is real the additional requirement of the equality of the signatures of. 
ant of, .,my respectively, must be met. 


of) 


Case m=2. If m=2, it was shown in [1] that the symmetric matrix A is 
simply 2-affine congruent to one of the various normal forms $f_1, f_2, \cdots, f_5$ 
given below and as indicated in Table I ($\delta$ regularly arranged): 
(0 1 ) 
0 bes 0 


bir dis 
bar bee 


0 
(0 1 


If P is 2-affine and nonsingular, it is possible through the proper selection 
of $p_{11}$ and $p_{22}$ to reduce certain of the matrices $f_1, f_2, \cdots, f_5$ yielding new 
matrices $g_1, \cdots, g_5$ having the same general form as $f_1, \cdots, f_5$, respectively, 
each $g_i$ being 2-affine congruent to $f_i$ $(i=1,\cdots,5)$. The forms 
$g_1, \cdots, g_5$ are indicated in Table I. Thus, in the case of $f_1$, $g_1$ is 
$f_1$ with $b_{22}$ replaced by 1 if § is algebraically closed, and with $b_{22}$ replaced by 1 

00 
00. 
10°. 
01 0 bi 0 * O--+ O din) 
00+0---001 
0 0 ° 
00 00. 
1 O bin 1 
01°. ) 
0 0--- 0 0 
0 1 
0 




or $-1$ if § is real. No further reduction of $f_1$ is possible by a 2-affine P which 
preserves the form $g_1$. The numbers $\delta_{11}$, $\delta_{22}$ in Table I denote 1 if § is algebraically 
closed, and denote 1 or $-1$ if § is real. In Case 3, the parameter $b_{22}$ 
is an absolute invariant. It should be noted that the number of such parameters 
appearing in the $g_i$'s is just one, whereas in the simply 2-affine case there 
were several such parameters, $b_{ij}$, in the forms $f_1, \cdots, f_5$. (See Table I, p. 171, 
paper [1], which may be constructed from Table I, as here given, by deleting 
the $\delta$'s and by replacing each 1 by the symbol $\neq 0$, and each $f_i$ by $g_i$.) 


TABLE I 
Classification of matrix A for the case m =2 


pu=r—3, r=3,4,-++,n+1 P 2-affine 


Pi~Pi2 p 


Conant 


oorr Oo 


61140 
0 


r—2 $110 
r—3 0 


If § is real, each form in Table I can be subdivided according to the sig- 
natures of Ay, Ai, As. 


The following theorems are now evident: 


THEOREM 5.2. A symmetric matrix A with elements in a field § is 2-affine 
congruent in § to one of the forms $g_1, \cdots, g_5$, according to the ranks (and signatures 
if § is real) of the circavariant matrices $A, A_1, A_2, A_1^2, A_{12}$ as indicated in 
Table I. A is simply 2-affine congruent to one of the forms $f_1, \cdots, f_5$ as indicated 
in Table I. 


THEOREM 5.3. A necessary and sufficient condition for the 2-affine congruence 
of two matrices A and C whose elements belong to the real field is that the circavariant 
matrices $A, A_1, A_2, A_1^2, A_{12}$ and $C, C_1, C_2, C_1^2, C_{12}$ have the same ranks 
and signatures, respectively, and that in Case 3 with P simply affine (Table I, 
paper [1]) the parameters $b_{22}$ and $c_{22}$, for A and C respectively, be identically 
equal. If § is algebraically closed, the above holds without the signatures. 


Case) pr bu | Form 
#2 =2 r—2 522 £1 
#2 =2 r—3 0 
#2  r—2 0 £2 
#2 0 £2 
r—2  r—2 522 #0 £2 
10 #2 322 £0 £2 
11 =2 r+1 gs 
12 =2 =2 r 1 & 
13 =2 =2 r—1 1 & 
15 =2 #2 r—1 gs 




Case m=m. The reduction of A to normal forms for P m-affine may be 
done in a manner similar to that used above for the case when m=2. 

6. Applications to the theory of forms and geometry. It is a simple matter 
to translate the results of this paper into the language of the theory of bilinear 
forms under cogredient m-affine transformations. 

A geometric study of the locus $F=\sum_{i=1}^{n}\sum_{j=1}^{n}a_{ij}x_ix_j=0$ in a geometry built 
upon the group of transformations $x_i=\sum_{j=1}^{n}p_{ij}\bar x_j$, $(i=1,\cdots,n)$, with $(p_{ij})$ 
m-affine can also be made. 

7. Relation to linear networks [2], [3]. We consider an m-terminal pair 
bilateral n-mesh linear electrical network $\mathfrak{N}$ containing (lumped) resistances, 
inductances, and capacitances. Let $E_1,\cdots,E_m$ be the (complex) e.m.f.'s impressed 
on terminal pairs $1,\cdots,m$, respectively; $Q_s$ and $I_s$, the (complex) 
charge and (complex) current, respectively, in mesh s; $R_{st}$, $L_{st}$, $D_{st}$ (real numbers), 
the lumped circuit parameters (resistance, inductance, and elastance, 
respectively) for mesh s if $s=t$, common to meshes s and t if $s\neq t$. The meshes 
are so chosen that mesh s, $(s=1,\cdots,m)$, is the only one which passes 
through the terminal pair s. 

Suppose the Kirchhoff equations in complex form for the network are 


(7.1) $B\{Q\} = \{E\},$ 


where $B=(b_{rs})$, $\omega$ being the (real) frequency. 
If B is nonsingular, 


(7.2) $\{Q\} = B^{-1}\{E\}.$ 


Let $B^{-1}=C=(c_{rs})$. The element $c_{uv}$ is called the generalized (complex) network 
admittance; being a transfer admittance between meshes u and v if $u\neq v$, and 
a driving-point admittance for mesh u if $u=v$. 

Let $Y=(Y_{rs})$ be C with all but the first m rows and first m columns deleted. 
Then 


(7.3) $\{Q\}_m = Y\{E\}_m,$ 


where $\{Q\}_m$ and $\{E\}_m$ are the first m rows of $\{Q\}$ and $\{E\}$, respectively. 
The matrix Y is called a characteristic (admittance) coefficient matrix for $\mathfrak{N}$ [2]. 

Let $\mathfrak{N}_1$ and $\mathfrak{N}_2$ be two m-terminal pair networks of characteristic matrices 
$Y^{(1)}$ and $Y^{(2)}$, respectively. $\mathfrak{N}_2$ is said to be circa-equivalent to $\mathfrak{N}_1$ if there exists 
a real nonsingular diagonal matrix D such that for all values of $\lambda$, $Y^{(2)}=D'Y^{(1)}D$. 

It should be noted here that this definition is much more general than the 
one heretofore used in the theory of equivalent electrical networks. The usual 
definition, the one given in papers [1], [2], and [3], is a very simple case of 
the one given in the present paper, being merely the type of circa-equivalence 
for which D is the identity matrix. 


If $\mathfrak{N}_1$ and $\mathfrak{N}_2$ are circa-equivalent, then the admittances $Y_{rs}^{(1)}$, $(r, s=1, 
\cdots, m)$, are relative circavariants. 

Let the diagonal element in the rth row of D be $d_r$, and let $\Sigma=(\sigma_{rs})$ where 
$\sigma_{rs}=d_rd_s$. If all of the elements in row k of $\Sigma$ are equal, $\mathfrak{N}_2$ is said to be relatively 
equivalent to $\mathfrak{N}_1$ with respect to terminal pair k. If each element in the kth row 
of $\Sigma$ is equal to one, $\mathfrak{N}_2$ is said to be absolutely equivalent to $\mathfrak{N}_1$ with respect to 
terminal pair k. If all the elements of $\Sigma$ are equal to a number $\sigma_0$, $\mathfrak{N}_2$ is relatively 
equivalent to $\mathfrak{N}_1$; and in case $\sigma_0=1$, $\mathfrak{N}_2$ is absolutely equivalent to $\mathfrak{N}_1$. 
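
The definitions in terms of $\Sigma$ are mechanical enough to be written out directly. The following sketch is an illustration only; the function names and the sample diagonals are assumptions, and the diagonal of D is passed as a plain list.

```python
from fractions import Fraction

def sigma(d):
    """Build Sigma = (d_r * d_s) from the diagonal entries d of D."""
    return [[dr * ds for ds in d] for dr in d]

def equivalence_for_pair(d, k):
    """Classify equivalence with respect to terminal pair k (0-based), using row k of Sigma."""
    row = sigma(d)[k]
    if all(x == 1 for x in row):
        return "absolute"
    if all(x == row[0] for x in row):
        return "relative"
    return "neither"

print(equivalence_for_pair([1, 1], 0))
print(equivalence_for_pair([2, 2], 0))
print(equivalence_for_pair([2, Fraction(1, 2)], 0))
```

With $D$ the identity every row of $\Sigma$ consists of ones, which is the usual (absolute) equivalence; with $d_1 = 2$, $d_2 = 1/2$ only the off-diagonal entry $\sigma_{12}$ equals one.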

Consider the set $\mathfrak{M}$ of all m-terminal n-mesh networks. Let $\mathfrak{N}(A)$ be an 
arbitrary network in $\mathfrak{M}$ having A for a network matrix. For each $\mathfrak{N}$ in $\mathfrak{M}$ 
select a possible network matrix. Let $\mathfrak{N}$ range over $\mathfrak{M}$. Denote the totality of 
network matrices so found by $\mathfrak{A}$. With each $\mathfrak{N}(A)$ associate the set $\mathfrak{L}$ of all 
networks $\mathfrak{N}(B)$ whose matrices B are congruent to A, 


(7.4) $B = P'AP,$ 


where P is restricted to the real field. Let $\mathfrak{P}$ denote the set of all P's which 
satisfy the above requirements. 

Next, with $\mathfrak{N}(A)$ arbitrary, let $\mathfrak{N}(A_k)$ and $\mathfrak{N}(B_k)$ denote the networks, 
having matrices $A_k$ and $B_k$, respectively, obtained by opening the mesh k 
of $\mathfrak{N}(A)$ and $\mathfrak{N}(B)$, respectively. We select a maximal subset $\mathfrak{P}_c$ of $\mathfrak{P}$ such that 
for all $\mathfrak{N}(A)$ of $\mathfrak{M}$ (that is, for all A of $\mathfrak{A}$), and for all P of $\mathfrak{P}_c$, $B_k=P_k'A_kP_k$, 
for $k=1,\cdots,m$. In other words, we restrict P to a set $\mathfrak{P}_c$ for which 
$A_1,\cdots,A_m$ are circavariant matrices of A. We denote by $\mathfrak{L}_c$ the subset 
of $\mathfrak{L}$ whose matrices $B, B_1,\cdots,B_m$ are thus related to $A, A_1, A_2,\cdots,A_m$. 

By Theorem 3.2 P must be m-affine. From Theorems 3.3 and 3.4 we know 
that $A_u^v$, $(u, v=1,\cdots,m)$, are also circavariant. 

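The characteristic coefficient matrices for $\mathfrak{N}(B)$ and $\mathfrak{N}(A)$ are 

(7.5) $Y^{(B)} = (Y_{rs}^{(B)}), \qquad Y^{(A)} = (Y_{rs}^{(A)}),$ 

where 

$$Y_{rs}^{(B)} = (-1)^{r+s}\frac{d(B_r^s)}{d(B)}, \qquad Y_{rs}^{(A)} = (-1)^{r+s}\frac{d(A_r^s)}{d(A)} \qquad (r, s=1,\cdots,m).$$

If P is m-affine, by Theorem 4.2 we know that the determinants in Y are 
circavariants of the set $\mathfrak{L}$ with respect to $A_r^s$. In fact, the admittances 

$$Y_{rs}^{(B)} = (-1)^{r+s}\frac{d(P_r)\cdot d(A_r^s)\cdot d(P_s)}{d(P)\cdot d(A)\cdot d(P)} = \frac{Y_{rs}^{(A)}}{p_{rr}p_{ss}} \qquad (r, s=1,\cdots,m)$$

are all relative circavariants. Evidently, 


(7.6) $Y^{(B)} = D'Y^{(A)}D,$ 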




where 


This shows that every network $\mathfrak{N}(B)$ of $\mathfrak{L}_c$ is circa-equivalent to $\mathfrak{N}(A)$. 
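
The relation between the two characteristic matrices can be tried out numerically. In the sketch below everything concrete is an assumption made for illustration: the matrices A and P, the helper names, and in particular the choice $D=\mathrm{diag}(1/p_{11},\cdots,1/p_{mm})$, which is what $d(P)=p_{rr}\cdot d(P_r)$ suggests for an m-affine P but which is not legible in the text as it survives. The leading block of the inverse is computed from cofactors with exact rational arithmetic.

```python
from fractions import Fraction

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def minor(m, i, j):
    """Matrix m with row i and column j removed (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def leading_inverse_block(mat, m):
    """First m rows and columns of the inverse of mat, via the adjugate."""
    d = det(mat)
    return [[Fraction((-1) ** (r + s) * det(minor(mat, s, r)), d) for s in range(m)]
            for r in range(m)]

# Hypothetical 2-terminal-pair, 3-mesh example: symmetric A, 2-affine P.
A = [[2, 1, 0], [1, 3, 4], [0, 4, 5]]
P = [[2, 0, 0], [0, 3, 0], [1, 4, 5]]
B = matmul(matmul(transpose(P), A), P)

Y_A = leading_inverse_block(A, 2)
Y_B = leading_inverse_block(B, 2)
# Assumed D = diag(1/p11, 1/p22); then D'Y_A D has entries Y_A[r][s] / (p_rr p_ss).
expected = [[Y_A[r][s] / (P[r][r] * P[s][s]) for s in range(2)] for r in range(2)]
print(Y_B == expected)
```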

Case m=2. Those networks of $\mathfrak{L}_c$ for which $p_{rr}=1$ are absolutely equivalent 
to $\mathfrak{N}(A)$ with respect to terminal pair r; those for which $p_{11}=p_{22}$ are 
relatively equivalent; and those for which $p_{11}=p_{22}=1$ are absolutely equivalent. 
By selecting P so that $p_{11}p_{22}=1$, $p_{11}\neq 1$, the transfer admittances of the 
corresponding $\mathfrak{L}_c$ will all be absolutely circavariant, though the driving-point 
admittances are only relative circavariants. 

If we select a subset $\mathfrak{L}'$ of $\mathfrak{L}$ so that $A_1^2$ is circavariant, then $Y_{12}^{(A)}$, the transfer 
admittance between meshes 1 and 2, will be a relative circavariant. The 
requirement that $A_1^2$ be circavariant makes P 2-affine so that $\mathfrak{L}'$ is $\mathfrak{L}_c$, and 
$Y_{11}^{(A)}$ and $Y_{22}^{(A)}$ are also relative circavariants. 

If we select a subset $\mathfrak{L}_1$ of $\mathfrak{L}$ for which $A_1$ is circavariant, then the driving-point 
admittance $Y_{11}^{(A)}$ is relatively circavariant, but $Y_{12}^{(A)}$ and $Y_{22}^{(A)}$ are not 
necessarily so. 

A subset $\mathfrak{L}_c$ of $\mathfrak{L}$ for which $A_1$ and $A_2$ are circavariant makes the admittances 
$Y_{11}^{(A)}$ and $Y_{22}^{(A)}$ relative circavariants, with P 2-affine; $A_1^2$ is then circavariant, 
so that the admittance $Y_{12}^{(A)}$ is also relatively circavariant. 

Further results. More generally, in the case of m-terminal pair networks, 
if $A_u^v$ be circavariant, then the admittances $Y_{uv}^{(A)}$, $Y_{uu}^{(A)}$, $Y_{vv}^{(A)}$ are 
relatively circavariant. 

The general theory of circa-equivalent networks initiated herein will be 
developed in greater detail at a later time. It should be noted that the special 
case when D is the identity matrix yields the usual theory of (absolutely) 
equivalent networks. 


CASE SCHOOL OF APPLIED SCIENCE, 
CLEVELAND, OHIO 


THE POSITION OF THE RADICAL IN AN ALGEBRA 


BY 
MARSHALL HALL 


1. Introduction. The papers of Brauer, Nesbitt, and Nakayama(¹) have 
given a great deal of information about algebras with a radical. These cover 
wide and interesting, but nevertheless special, classes of algebras. This paper 
gives a new approach to the study of the most general class of algebras with 
a radical, starting from the fundamental theorem that every linear associative 
algebra is uniquely decomposable as the direct sum of a semisimple algebra 
and an algebra bound to its radical. Here an algebra is said to be bound to its 
radical (for short: a bound algebra) if the two-sided annihilators of the radical 
are contained in the radical. In the light of this result, further investigations 
on algebras with a radical may be confined to bound algebras. A bound alge- 
bra is largely determined by its radical. In particular (Theorem 3.8) if the 
radical is of order s, the bound algebra is at most of order s?+s+1. 

A combination of the right and left representations of a bound algebra 
on its radical yields a faithful representation of the algebra modulo the two- 
sided annihilator of the radical. To obtain a faithful representation of the 
bound algebra itself (Theorem 3.2) we must adjoin to this representation a 
system of “remnants” comparable to the factor sets used in the extension of 
groups or in the theory of normal simple algebras. 

A bound algebra is composed (Theorem 3.6) of its radical combined with 
three orthogonal algebras of which two are semisimple. The third has a unit 
and is “doubly represented.” 

The final section of this paper is concerned with the problem of construct- 
ing all algebras with a given radical. Some examples are given to illustrate 
different aspects of this problem. 

2. Decomposition of algebras. Let $\mathfrak{A}$ be an arbitrary linear associative 
algebra and let $\mathfrak{R}$ be its radical. 


Presented to the Society, April 7, 1939; received by the editors March 7, 1940. This paper 
was received by the editors of the Annals of Mathematics June 19, 1939, accepted by them, 
and later transferred to these Transactions. 

(¹) R. Brauer, C. Nesbitt, On the regular representations of algebras, Proceedings of the 
National Academy of Sciences, vol. 23 (1937); On the Modular Representation of Groups of Finite 
Order, University of Toronto Studies, Mathematical Series, vol. 4, 1937. 

T. Nakayama, C. Nesbitt, Note on symmetric algebras, Annals of Mathematics, (2), vol. 39 
(1938). 

C. Nesbitt, On the regular representations of algebras, Annals of Mathematics, (2), vol. 39 
(1938). 

T. Nakayama, Some studies on regular representations, induced representations, and modular 
representations, Annals of Mathematics, (2), vol. 39 (1938); On Frobeniusean algebras I, ibid., 
vol. 40 (1939); On Frobeniusean algebras II, to appear shortly; On the structure of symmetric 
algebras and Galois moduli over modular fields, to appear shortly. 




392 MARSHALL HALL, [November 


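THEOREM 2.1. If $\mathfrak{a}$ is a left ideal of $\mathfrak{A}$, there is an idempotent e such that 
$\mathfrak{a}=(e)_l+\mathfrak{r}_1$ where $\mathfrak{r}_1\subseteq\mathfrak{R}$ and $\mathfrak{r}_1e=0$. If $\mathfrak{b}$ is a right ideal, there is an idempotent 
f such that $\mathfrak{b}=(f)_r+\mathfrak{r}_2$ where $\mathfrak{r}_2\subseteq\mathfrak{R}$ and $f\mathfrak{r}_2=0$. If $\mathfrak{c}$ is a two-sided ideal, there 
is an idempotent g such that $\mathfrak{c}=(g)_l+\mathfrak{r}_3=(g)_r+\mathfrak{r}_4$ where $\mathfrak{r}_3, \mathfrak{r}_4\subseteq\mathfrak{R}$ and $\mathfrak{r}_3g=g\mathfrak{r}_4
=0$. 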


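For the right or left ideals this theorem has been proved by the author(²). 
A two-sided ideal $\mathfrak{c}$ may be considered as both a left ideal and a right ideal: 


(2.1) $\mathfrak{c} = (e)_l+\mathfrak{r}_1,\quad \mathfrak{r}_1e = 0; \qquad \mathfrak{c} = (f)_r+\mathfrak{r}_2,\quad f\mathfrak{r}_2 = 0.$ 


Here $f=we+r_1$, where $r_1e=0$ and $e=fu+r_2$ where $fr_2=0$. Hence $fe=we=fu$ 
and $f-e=r_1-r_2\in\mathfrak{R}$. If $x\in(f)_r+\mathfrak{r}_2=\mathfrak{c}$, $x=fv+s$, $s\in(\mathfrak{R}\cap\mathfrak{c})$. Hence, writing 
$r=f-e$, $x=(e+r)v+s=ev+rv+s$ where $rv+s\in(\mathfrak{R}\cap\mathfrak{c})$, and so $x\in(e)_r\cup(\mathfrak{R}\cap\mathfrak{c})$. But since 
$e\in\mathfrak{c}$, $(e)_r\subseteq\mathfrak{c}$, whence $\mathfrak{c}\subseteq(e)_r\cup(\mathfrak{R}\cap\mathfrak{c})\subseteq\mathfrak{c}$ or $\mathfrak{c}=(e)_r\cup(\mathfrak{R}\cap\mathfrak{c})$. But $(\mathfrak{R}\cap\mathfrak{c})
=e(\mathfrak{R}\cap\mathfrak{c})+\mathfrak{r}_3$ where $e\mathfrak{r}_3=0$, $\mathfrak{r}_3\subseteq\mathfrak{R}$. Hence $\mathfrak{c}=(e)_r\cup(e(\mathfrak{R}\cap\mathfrak{c})+\mathfrak{r}_3)=(e)_r+\mathfrak{r}_3$. 
Thus in the two representations of (2.1) we may assume without loss of generality 
that the idempotents e and f are the same, which is the statement of 
the theorem. 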


DEFINITION. An algebra $\mathfrak{A}$ is said to be bound to its radical $\mathfrak{R}$ (briefly: $\mathfrak{A}$ is a
bound algebra) if for $c \in \mathfrak{A}$, $c\mathfrak{R} = \mathfrak{R}c = 0$ implies that $c \in \mathfrak{R}$.
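
The definition can be tried out on a very small case. The sketch below is only an illustration and not part of the paper: it assumes the field GF(2), so that every algebra is a finite set that can be enumerated, and it takes the radical as given rather than computing it. The names `is_bound`, `T2`, `RAD`, `SUM` are hypothetical. It tests the upper triangular 2 by 2 matrices (radical spanned by the matrix unit E12) and the direct sum of GF(2) with that algebra.

```python
from itertools import product

P = 2  # assumption: work over GF(2) so that all elements can be enumerated


def mat_mul(x, y):
    """Multiply two 2x2 matrices stored as ((a, b), (c, d)), entries mod P."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) % P for j in range(2))
        for i in range(2)
    )


ZERO = ((0, 0), (0, 0))

# T2: upper triangular 2x2 matrices over GF(2); its radical is spanned by E12.
T2 = [((a, b), (0, d)) for a, b, d in product(range(P), repeat=3)]
RAD = [((0, b), (0, 0)) for b in range(P)]


def is_bound(elements, radical, mul, zero):
    """Check the definition: c*R = R*c = 0 must force c to lie in R."""
    for c in elements:
        kills = all(mul(c, r) == zero and mul(r, c) == zero for r in radical)
        if kills and c not in radical:
            return False
    return True


# Direct sum GF(2) (+) T2: pairs (k, M) multiplied componentwise.
SUM = [(k, m) for k in range(P) for m in T2]
SUM_RAD = [(0, r) for r in RAD]


def sum_mul(x, y):
    return ((x[0] * y[0]) % P, mat_mul(x[1], y[1]))


print(is_bound(T2, RAD, mat_mul, ZERO))            # T2 alone
print(is_bound(SUM, SUM_RAD, sum_mul, (0, ZERO)))  # GF(2) (+) T2
```

In the direct sum the element (1, 0) annihilates the radical on both sides without lying in it, which is the kind of semisimple summand that Theorem 2.2 splits off.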


THEOREM 2.2. Any linear associative algebra is uniquely decomposable as 
the direct sum of a semisimple algebra and a bound algebra. 


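Let $\mathfrak{A}$ be a linear associative algebra, and let $\mathfrak{R}$ be its radical. The two-sided
annihilators of $\mathfrak{R}$, elements $c$ such that $c\mathfrak{R} = \mathfrak{R}c = 0$, form a two-sided
ideal $\mathfrak{R}^t$. By Theorem 2.1 there is an idempotent $g$ such that $\mathfrak{R}^t = (g)_l + \mathfrak{r}_3$
$= (g)_r + \mathfrak{r}_4$ where $\mathfrak{r}_3, \mathfrak{r}_4 \subset \mathfrak{R}$, $\mathfrak{r}_3 g = 0$, $g\mathfrak{r}_4 = 0$. If

(2.2)   $\mathfrak{A} = \mathfrak{A}_1 + \mathfrak{A}_2 + \mathfrak{A}_3 + \mathfrak{A}_4$

is the two-sided Peirce decomposition of $\mathfrak{A}$ with respect to $g$, we have, for
$a_i \in \mathfrak{A}_i$,

(2.3)   $g a_1 = a_1, \quad g a_2 = a_2, \quad g a_3 = 0, \quad g a_4 = 0,$
        $a_1 g = a_1, \quad a_2 g = 0, \quad a_3 g = a_3, \quad a_4 g = 0.$

Consider any $a_2 \in \mathfrak{A}_2$. Here $\mathfrak{R}a_2 = \mathfrak{R}g a_2 = 0$. Also $a_2\mathfrak{R} = g a_2\mathfrak{R} \subset g\mathfrak{R} = 0$.
Hence $a_2 \in \mathfrak{R}^t$ and so $a_2 = wg + r$ where $r \in \mathfrak{r}_3$, $rg = 0$. Here $0 = a_2 g = wg$ and
$a_2 = r \in \mathfrak{R}$. Hence $a_2 = g a_2 = gr \in g\mathfrak{R} = 0$ and $a_2 = 0$. Hence $\mathfrak{A}_2 = 0$. Similarly $\mathfrak{A}_3 = 0$.
Hence (2.2) reduces to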


(2) A type of algebraic closure, Annals of Mathematics, (2), vol. 40 (1939), Theorem 6.1.
It is evident that the use of a unit in this theorem is not essential. 


1940] THE RADICAL IN AN ALGEBRA: 


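(2.4)   $\mathfrak{A} = \mathfrak{A}_1 + \mathfrak{A}_4$


where the sum is direct since in consequence of the relations (2.3) $\mathfrak{A}_1\mathfrak{A}_4 = \mathfrak{A}_4\mathfrak{A}_1$
$= 0$. Since also $\mathfrak{A}_1^2 \subset \mathfrak{A}_1$, $\mathfrak{A}_4^2 \subset \mathfrak{A}_4$, $\mathfrak{A}_1$ and $\mathfrak{A}_4$ are subalgebras of $\mathfrak{A}$. Moreover
the elements annihilated by $g$ on both sides are in $\mathfrak{A}_4$, and so $\mathfrak{R} \subset \mathfrak{A}_4$. If $\mathfrak{A}_1$ had
a radical it would be part of the radical of $\mathfrak{A}$, which is in $\mathfrak{A}_4$. Hence $\mathfrak{A}_1$ has no
radical, and is semisimple. Moreover $\mathfrak{R}$ is the radical of $\mathfrak{A}_4$. Suppose $a_4 \in \mathfrak{A}_4$ is
a two-sided annihilator of $\mathfrak{R}$. Then $a_4 \in (g)_l + \mathfrak{r}_3$ and $a_4 = wg + r_3$ where $r_3 g = 0$.
But $0 = a_4 g = wg$, and so $a_4 = r_3 \in \mathfrak{R}$. Hence the two-sided annihilators of $\mathfrak{R}$
in $\mathfrak{A}_4$ are in $\mathfrak{R}$, and $\mathfrak{A}_4$ is bound to its radical $\mathfrak{R}$. $\mathfrak{A}$ is the direct sum of the
semisimple algebra $\mathfrak{A}_1$ and the bound algebra $\mathfrak{A}_4$.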
Now suppose


(2.5)   $\mathfrak{A} = \mathfrak{B} + \mathfrak{C}$


is any decomposition of $\mathfrak{A}$ as the direct sum of a semisimple algebra $\mathfrak{B}$ and a
bound algebra $\mathfrak{C}$. The unit $h$ of $\mathfrak{B}$ is a two-sided annihilator of $\mathfrak{C}$ and a fortiori
of $\mathfrak{R} \subset \mathfrak{C}$. Hence, under the decomposition (2.4) of $\mathfrak{A}$, $h = h_1 + h_4$ and each of
$h_1$, $h_4$ is a two-sided annihilator of $\mathfrak{R}$. But $h = h^2 = h_1^2 + h_4^2$, whence $h_4 = h_4^2$. As
$\mathfrak{A}_4$ is a bound algebra, $h_4 \in \mathfrak{R}$. An idempotent can be in the radical only if
it is zero, and so $h_4 = 0$, $h = h_1 \in \mathfrak{A}_1$ and $h = gh = hg$. A similar argument shows
that $g = hg = gh$, starting from the decomposition of $g$ in (2.5). Combining
results, we obtain $g = h$. Evidently (2.5) as a direct sum is the two-sided Peirce
decomposition of $\mathfrak{A}$ with respect to $h$ and consequently must be identical with
(2.4). This proves the uniqueness part of the theorem.

3. Representation and properties of bound algebras. Let $\mathfrak{A}$ be a bound
algebra over a field $K$ and $x_1, \cdots, x_s$ a basis of its radical $\mathfrak{R}$. Then if $c$ is an
arbitrary element of $\mathfrak{A}$,

(3.1)   $x_i c = \sum_{j=1}^{s} a_{ij} x_j, \qquad c x_i = \sum_{j=1}^{s} b_{ij} x_j,$

and we have the right representation of $\mathfrak{A}$ on $\mathfrak{R}$:
(3.2)   $c \to (a_{ij}) = R(c)$
and the left representation(3) of $\mathfrak{A}$ on $\mathfrak{R}$:
(3.3)   $c \to (b_{ij}) = L(c)$.
Now (3.2) is a faithful representation of $\mathfrak{A}/\mathfrak{R}^r$ since the $c$'s mapped onto
zero are the right annihilators of $\mathfrak{R}$. Similarly (3.3) is a faithful representation

(3) For the left representation $cd \to L(d)L(c)$. Ordinarily the transpose $L(c)^T$ is used to
preserve the order of multiplication. But in this paper it seems desirable to leave the representa- 
tion in this form. 




of $\mathfrak{A}/\mathfrak{R}^l$. (Note that $\mathfrak{R}^r$ and $\mathfrak{R}^l$ are both two-sided ideals.) A combination of
(3.2) and (3.3) is better than either one separately. If we put


(3.4)   $c \to [R(c), L(c)],$

we have the rules of combination


(3.5)   $c_1 + c_2 \to [R(c_1) + R(c_2), L(c_1) + L(c_2)],$
        $c_1 c_2 \to [R(c_1)R(c_2), L(c_2)L(c_1)],$
        $kc \to [kR(c), kL(c)], \quad k \in K.$


Here (3.4) subject to the combinatory rules (3.5) is a faithful representation
of $\mathfrak{A}$ modulo $\mathfrak{R}^t$.


THEOREM 3.1. In the representation (3.4) every matrix $R(c_1)$ permutes with
every matrix $L(c_2)$.


This well known theorem on representations is an immediate consequence
of the associative law $c_2(x c_1) = (c_2 x)c_1$, $x \in \mathfrak{R}$.
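
Theorem 3.1 and the product rules of (3.5) can be checked numerically on a concrete algebra. The following is a minimal sketch and not taken from the paper: it assumes the algebra of upper triangular 3 by 3 integer matrices, whose radical is the strictly upper triangular part with basis E12, E13, E23, and the helper names (`right_rep`, `left_rep`, `c1`, `c2`) are hypothetical.

```python
N = 3
# Positions of the matrix units E12, E13, E23 used as the basis x_1, x_2, x_3
# of the radical (assumed example, not from the paper).
POS = [(0, 1), (0, 2), (1, 2)]


def mul(x, y):
    """Product of two 3x3 matrices; used for algebra elements and for R(c), L(c)."""
    return [[sum(x[i][k] * y[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def unit(i, j):
    m = [[0] * N for _ in range(N)]
    m[i][j] = 1
    return m


BASIS = [unit(i, j) for i, j in POS]


def coords(r):
    """Coordinates of a strictly upper triangular matrix in the basis above."""
    return [r[i][j] for i, j in POS]


def right_rep(c):
    """R(c): row i holds the coefficients a_ij of x_i * c."""
    return [coords(mul(x, c)) for x in BASIS]


def left_rep(c):
    """L(c): row i holds the coefficients b_ij of c * x_i."""
    return [coords(mul(c, x)) for x in BASIS]


c1 = [[1, 2, 3], [0, 4, 5], [0, 0, 6]]
c2 = [[7, 1, 0], [0, 2, 3], [0, 0, 5]]

# R(c1) permutes with L(c2) (Theorem 3.1).
print(mul(right_rep(c1), left_rep(c2)) == mul(left_rep(c2), right_rep(c1)))
# R preserves the order of multiplication; L reverses it, as in (3.5).
print(right_rep(mul(c1, c2)) == mul(right_rep(c1), right_rep(c2)))
print(left_rep(mul(c1, c2)) == mul(left_rep(c2), left_rep(c1)))
```

The radical here happens to have order 3, so the same `mul` serves both for the algebra and for the representing matrices; for other examples the two sizes would differ.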

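To obtain a faithful representation of $\mathfrak{A}$ we must extend the representation
(3.4) by the adjunction of $\mathfrak{R}^t$. Let $z_1, \cdots, z_q$ be a basis of $\mathfrak{R}^t$ and extend this
to a basis of $\mathfrak{R}$, $\bar{x}_1, \cdots, \bar{x}_p, z_1, \cdots, z_q$, where $p + q = s$, and finally to a basis
of $\mathfrak{A}$, $\bar{u}_1, \cdots, \bar{u}_m, z_1, \cdots, z_q$. If, in the homomorphism $\mathfrak{A} \to \mathfrak{A}/\mathfrak{R}^t$, $\bar{u}_i \to u_i$,
then $u_1, \cdots, u_m$ form a basis of $\mathfrak{A}/\mathfrak{R}^t$. We shall call $\bar{u}_i$ the
representative of its class in $\mathfrak{A}$ modulo $\mathfrak{R}^t$. To extend this concept of representative,
suppose an arbitrary $c \in \mathfrak{A}$ is given by


(3.6)   $c = \sum_{j=1}^{m} c_j \bar{u}_j + \sum_{k=1}^{q} r_k z_k.$


Then the mapping $\mathfrak{A} \to \mathfrak{A}/\mathfrak{R}^t$, which we may suppose given by (3.4), maps $c$
onto an element $y = \sum_{j=1}^{m} c_j u_j$. If we now write


(3.7)   $\bar{y} = \sum_{j=1}^{m} c_j \bar{u}_j,$


we call $\bar{y}$ the representative of the class of $\mathfrak{A}$ modulo $\mathfrak{R}^t$ mapped onto $y$.
Hence any $c$ of $\mathfrak{A}$ is expressible uniquely as


(3.8)   $c = \bar{y} + r,$


where $c \to y$ by the homomorphism (3.4) and $r \in \mathfrak{R}^t$.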

Let us now suppose the basis of $\mathfrak{R}$ used in (3.4) is chosen to be $\bar{x}_1, \cdots, \bar{x}_p$,
$z_1, \cdots, z_q$. Since $\mathfrak{R}^t$ is a two-sided ideal, the matrices $R(c)$ and $L(c)$ must be
of the form


Rir(c) | Ria(c) | 
(3.9) R(c) ( 0 | L(c) ( 0 ’ 




where $R_{22}(c)$ and $L_{22}(c)$ are $q$ by $q$ matrices giving the right and left
transformations induced on $\mathfrak{R}^t$ by $c$. If


q 
(3.10.1) r= rice 
kewl 
is any element of R‘, then 
q q 


where 
h 
(3.10.3) | = Reo(c) | | = 
ty 5q 
THEOREM 3.2, REPRESENTATION OF BOUND ALGEBRAS. Let U be an algebra 
bound to its radical R. Then a faithful representation of X is given by 
(3.11) c= [R(c), L(c), W(o)], 


where R(c), L(c) are given by (3.1)-(3.3) and are of the type (3.9) and W(c) 
=(r,--- 14)’ (the prime indicating the transpose of the vector) is determined 
by r=) Luurece in (3.6). The rules of combination in (3.11) are given by 
ke = [kR(c), kL(c), kW(c)], ke K, 
(3.12) + ce = [R(cs) + R(c2), L(cs) + L(ce), W(cr) + W(c2)], 
= [R(cs)L(c2), L(C2)L(¢1), Rao(c2)W(c1) + + ]. 


Here v2} is determined by 


q 
(3.13) = t+ v2) = Do 


Proof. It is easily seen that (3.11) yields a one-to-one correspondence be- 
tween the elements of % and the symbols [R(c), L(c), W(c)]. For (3.8) ex- 
presses c uniquely as the sum 7+r and 7=y, y=[R(c), L(c)], re2W(c). It 
remains to show that the rules of combination (3.12) are in accord with this 
correspondence. For the sum and scalar product this is evident. For the prod- 
uct 


Co = ¥2 + 12, 
= + + V2 = + + + 


where L(c2)L(c1)] from (3.5) and W(c2), 
by (3.10.3) and r(j, v2} by (3.13). The element 


(3.14) 




$r(y_1, y_2)$ will be called the remnant of the product $\bar{y}_1\bar{y}_2$. Here $r_1 r_2 = 0$ since
$(\mathfrak{R}^t)^2 = 0$ because $(\mathfrak{R}^t)^2 \subset \mathfrak{R}^t\mathfrak{R} = 0$.


THEOREM 3.3. The remnants $r(x, y)$ of (3.13) satisfy the following relations:

(3.15)  $r(x + y, z) = r(x, z) + r(y, z),$
        $r(x, y + z) = r(x, y) + r(x, z),$
        $r(kx, y) = r(x, ky) = k\,r(x, y),$
        $\bar{x}\bar{y}\,r(u, v) = \overline{xy}\,r(u, v),$
        $r(u, v)\,\bar{x}\bar{y} = r(u, v)\,\overline{xy},$
        $\bar{x}\,r(y, z) + r(x, yz) = r(xy, z) + r(x, y)\,\bar{z}.$


From the definition of the representative $\bar{y}$ in (3.7) it follows immediately
that


(3.16)  $\overline{x + y} = \bar{x} + \bar{y}, \qquad \overline{kx} = k\bar{x}, \quad k \in K;$


whence the first three relations are easily derived. For the fourth relation,
$\bar{x}\bar{y}\,r(u, v) = (\overline{xy} + r(x, y))\,r(u, v) = \overline{xy}\,r(u, v)$ since $(\mathfrak{R}^t)^2 = 0$. The fifth relation
may be derived in the same way. The last and perhaps the most important
relation is obtained by multiplying out $\bar{x}(\bar{y}\bar{z}) = (\bar{x}\bar{y})\bar{z}$. In constructing a bound
algebra with a given radical it is this last relation which is most difficult to
satisfy.


_ If the representatives %,---, @m, are replaced by representatives 
ti, &m in the same classes of modulo then 


£+ a(x), 

where a(x) satisfies the linearity conditions 

(3.18) a(x + y) = a(x) + a(y), a(kx) = ka(x), ke K. 


THEOREM 3.4. If the representatives of the classes of U modulo R‘ are changed 
by the rule (3.17), then the remnants r(x, y) are changed by the rule 


(3.19)  $r'(x, y) = r(x, y) + \bar{x}\,a(y) + a(x)\,\bar{y} - a(xy).$


Proof. or BH+r'(x, or 
ta(xy) +r'(x, y) y) +4a(y) whence (3.19) follows. 

THEOREM 3.5. If €n are orthogonal idempotents in U/R', then the 
representatives 2, +++ , &, may be chosen as orthogonal idempotents. 


The proof of this theorem exactly parallels the proof of Theorem 1 on 
page 16 of Deuring’s Algebren. 




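THEOREM 3.6. In a bound algebra $\mathfrak{A}$ there are subalgebras $\mathfrak{A}_l$, $\mathfrak{A}_r$, $\mathfrak{A}_d$ such
that $\mathfrak{A} = \mathfrak{A}_l + \mathfrak{A}_r + (\mathfrak{A}_d \cup \mathfrak{R})$ with the following properties:

1. $\mathfrak{A}_l$, $\mathfrak{A}_r$, $\mathfrak{A}_d$ have units and are orthogonal.

2. $\mathfrak{A}_l$ and $\mathfrak{A}_r$ are semisimple and $\mathfrak{R}\mathfrak{A}_l = 0$, $\mathfrak{A}_r\mathfrak{R} = 0$.

3. $\mathfrak{A}/\mathfrak{R}$ is the direct sum of three semisimple algebras isomorphic to $\mathfrak{A}_l$, $\mathfrak{A}_r$,
$\mathfrak{A}_d/(\mathfrak{A}_d \cap \mathfrak{R})$.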


Proof. R* and ®! are both two-sided ideals and Y itself may be considered 
a two-sided ideal, whence by Theorem 2.1 
= (e1), = + te, MR! = + te, 


(3.20) 
MR! = + = (ec), + ts, = (ce): + te, 


with relations on the t’s as given in Theorem 2.1. 

Put ¢;=e—e€,—é2. Then it may be shown that e¢,e;, i#j, is a two-sided 
annihilator of and that consequently the images of ¢é1, é2, and e;in U/R* are 
orthogonal idempotents. By Theorem 3.5 there exist orthogonal idempotents 
ei, e¢, and ef in the classes of and e¢; modulo Just as in the proof 
of Theorem 2.1 it may be shown that ey, e/, and e’ =e/ +e/ +e may be used 
in the representation of the ideals in (3.20). Stated formally: 


LEMMA. Without loss of generality it may be assumed that the idempotents 
of (3.20) satisfy the following relations : 


(3.21) = = Clg = = = = 


Now put and Here and are the 
units of %,, U,, and Wa respectively, and the orthogonality of these algebras 
is an immediate consequence of the orthogonality of these idempotents. This 
is the first property mentioned in the theorem. 

To prove the decomposition 


(3.22) Y= %+A-+ 


take any x of %. From the relations (3.20) x =ew+/ where ¢ e R, et=0. Here 
ex =ew, x =ex+t. Also ex=ue+s withs e R, se=0 and so exe = ue, ex =exe+s, 
x=exe+s+i=exe+p with pe Hence +p 
= + + + p* = x1 +x,+(xa+p*). Here p* =), in since 
exe; is in the radical. For exe; e R’=(e1):+ 12, where te R, 
te,=0. Hence 0 = exe ;e; = ue; and exe;=t e R. A similar argument holds for 
all the ewxe;, 71. To show that the sum x =x,+7,+(xa+p*) is unique it is 
enough to show that 


(3.23) O = + + (%a + 


implies that x,, x,, and xa+p* all vanish. For 0=x,e:+%,€:+(xa+p*)e; and 
here x,€; =0, = =0, and since p* e p*e, = 0. Hence x, 




=(0. Then 0 = and ex4=0, ep*=0, and so x,=ex,=0. Now 
as x,=0, x,=0, then from (3.23) x4+p*=0. This proves the decomposition 
(3.22). 

For the second property, since A%,=e%e, and eR=0 it follows that 
Similarly RA, =0. 

It remains to show that YW, and Y; are semisimple. If %, contained a nil- 
potent ideal r, then 


cre + rR =r +R 


by the decomposition (3.22) and the orthogonality of Y%,, A,, and Ws. Here 
t+r® would be a nilpotent ideal in & and hence contained in f, whence r 
would be contained in ®. But %, cannot contain any elements of ® since és, 
the unit of Y,, is a left annihilator of R. In the same way it may be shown that 
%, is semisimple. 

Applying the homomorphism Y—-%°/R to the decomposition (3.22) we 
have 


(3.24) WR OU, Ha, 


%; being the image of %;. The sum is direct since the e; and a fortiori their 
images are orthogonal. Since %,; and YW, are semisimple they must be iso- 
morphic to their images. And as RY. v O=Ya. The radical of 
Asis Aan Rand so 


THEOREM 3.7. R>RA,, RUM are faithful representations of A, and A; 


respectively. Neither of the mappings R-RAs and R-AzR maps onto zero 
any element of Ua not in R. Hence R>-RAa is a faithful representation of Aa/b: 
and is a faithful representation of where bi, be R, 
=0. 


Suppose the mapping R— RA, represents some z of %, as-0. Then Rz=0, 
but 4,8 =0, and so zh =0. Hence as a two-sided annihilator of R, z belongs 
to R. But by the preceding theorem Y, contains no elements of #. Hence z=0 
and RRA, is a faithful representation. Similarly R--W,R is a faithful rep- 
resentation. 

If R-RAa represents a ze Aa by zero, then Rz=0, ze R’=(e:)-+h1 
=(€:):+t2. For any we with te R, e Ai, and so for 
ze WR’, ze Ua, z=t &€ R. Hence the only elements of %z represented by zero 
in RRA are elements of R. Those elements mapped onto zero form a two- 
sided ideal b, in ®, and since Rb, =0 a fortiori bj =0. Similarly we may treat 
the representation 


THEOREM 3.8. If a radical $\mathfrak{R}$ is of order $s$, then an algebra $\mathfrak{A}$ bound to $\mathfrak{R}$ is
at most of order $s^2 + s + 1$.


(4) These two algebras may be different. See Example 2 in §4.




Proof. Consider the decomposition (3.22) of $\mathfrak{A}$ and the representation
(3.11) of $\mathfrak{A}$. Since neither $\mathfrak{A}_l$ nor $\mathfrak{A}_r$ contains any elements of $\mathfrak{R}^t$, we may
choose their elements as representatives of their classes modulo $\mathfrak{R}^t$ in (3.11)
and hence


$c \to [0, L(c), 0], \quad c \in \mathfrak{A}_l,$
$c \to [R(c), 0, 0], \quad c \in \mathfrak{A}_r;$


whence $\mathfrak{A}_l$ may be called the left represented subalgebra, $\mathfrak{A}_r$ the right represented
subalgebra. In the sense of Theorem 3.7 $\mathfrak{A}_d$ may be called the doubly
represented subalgebra.

Now we appeal to the theorems on fully reducible matric algebras as they 
appear in Weyl’s The Classical Groups, Their Invariants and Representations, 
chap. 3. As %; is semisimple, the representation R--%,R is fully reducible, 
the irreducible components corresponding to the simple algebras whose direct 
sum is If OU, where the are simple, then the representa- 
tion breaks up into blocks of degrees , Zny Zn41 =S—(Qitget +£n), 
the ith block containing a certain number of equivalent irreducible repre- 
sentations of %;, and the last vanishing. The commutator algebra of Y; will 
break up into blocks B,, --- , B,, Bas: where B; is the commutator algebra 
of the ith block, i=1, ---, , and B,4; is the complete g2,; matric algebra. 
From Weyl, page 93, the orders h; and h/ of A; and B; respectively satisfy 

Now every element of %, and every element of Uz not in R has a proper 
representation c=R(c) from Theorem 3.7, and by Theorem 3.1 R(c) must be 
a subalgebra of the commutator algebra of L(c). Hence the order of & does 
not exceed the order of &; plus the order of the commutator of YW; plus the 
order of ®. Hence the order of Y is at most 


(3.25) 


(hs +L) = (es + + ts. 


The g’s and h’s are positive integers (g,41 might be zero) and the sum of the g’s 
is s. Here it is very easy to show that 


i=l 
and that equality holds only when »=1, gi=s, and h,=1 or s*. This proves 
the theorem. This result is the best possible since we could have an algebra 
with ?=0, Y, the complete s? matric algebra and WY, the scalar algebra of 
order 1. On the other hand, when ?+0 the order of & must be less than 
s*+-s+1 and it should be possible to obtain various improvements on this 
theorem. 

4. Construction of bound algebras. In constructing all algebras bound to 




a given radical $\mathfrak{R}$, we turn to representation (3.11), remembering that from
Theorem 3.6 we need determine only $\mathfrak{A}_l$, $\mathfrak{A}_r$, $\mathfrak{A}_d$ separately. When $\mathfrak{A}_d$ is void,
equations (3.25) make the construction of the algebra relatively easy. Any
two semisimple matric algebras which permute with each other and the representations
of elements of $\mathfrak{R}$ may be considered the right and left represented
subalgebras of an algebra bound to $\mathfrak{R}$. The construction of algebras in which
$\mathfrak{A}_d$ is not void offers many more difficulties. In the first place $\mathfrak{A}_d$ is not usually
semisimple and its right and left representations may be different algebras.
Moreover the faithful representation of $\mathfrak{A}_d$ in (3.11) may involve remnants
which must be chosen to satisfy equations (3.15).


THEOREM 4.1. Given a nilpotent algebra $\mathfrak{R}$ of order $s$, let the right representation
of $r \in \mathfrak{R}$ on a basis of $\mathfrak{R}$ be $r \to R(r)$ and the left representation be $r \to L(r)$.
If, in the homomorphism $\mathfrak{R} \to \mathfrak{R}/\mathfrak{R}^t$, $r \to \rho$, then $\rho \to [R(r), L(r)]$ is a faithful
representation of $\mathfrak{R}/\mathfrak{R}^t$. Suppose (1) $c \to [R(c), L(c)]$ is a right-left representation
of an $s$ by $s$ matric algebra $\mathfrak{A}'$ whose radical is $\mathfrak{R}/\mathfrak{R}^t$ and that every $R(c_1)$ permutes
with every $L(c_2)$; (2) remnants $r(x, y)$ are chosen from $\mathfrak{R}^t$ for every pair
$x$, $y$ of elements of $\mathfrak{A}'$ such that equations (3.15) are satisfied; and (3) the remnants
$r(\rho_i, \rho_j)$ are such that $\bar{\rho}_i$ may be considered representatives of classes of $\mathfrak{R}$ modulo
$\mathfrak{R}^t$. Then (3.11) yields a faithful representation of an algebra bound to $\mathfrak{R}$, the
rules of combination being given by (3.12) and (3.13).


Theorem 3.2 shows that every bound algebra has a representation (3.11).
This theorem shows that conversely the symbols (3.11) define an algebra
bound to $\mathfrak{R}$ provided that certain conditions are satisfied. The proof is direct
though a little tedious and will only be sketched here. It must be shown
that the rules (3.12) and (3.13) actually define an associative algebra and it
is here that we need equations (3.15) and the permutability of $R(c_1)$ and
$L(c_2)$. Moreover conditions 1 and 2 assure us that the radical of this algebra
is $\mathfrak{R}$, and neither more nor less. That $\mathfrak{A}$ is bound to $\mathfrak{R}$ is an immediate consequence
of the fact that the elements of $\mathfrak{A}$ are properly represented on $\mathfrak{R}$
apart from $\mathfrak{R}^t \subset \mathfrak{R}$.

In practice the following theorem is of use:


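THEOREM 4.2. $c \to R(c)$ is a faithful representation of $\mathfrak{A}/\mathfrak{R}^r$, and the radical
of $[R(c)]$ is the right representation of $\mathfrak{R}$. $\mathfrak{A}/\mathfrak{R}^r$ modulo its radical is isomorphic
to $\mathfrak{A}_r \oplus \mathfrak{A}_d$. Similarly the radical of $[L(c)]$ is the left representation of $\mathfrak{R}$, and
$[L(c)]$ modulo its radical is isomorphic to $\mathfrak{A}_l \oplus \mathfrak{A}_d$.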


Proof. The matric algebra [R(c) ] is Y/R" since it is a homomorphic image 
of % in which the elements mapped onto zero are those of R’. Suppose T is its 
radical. The elements of 9 mapped onto T and zero form a two-sided ideal K, 
and include and Since T*=0, R*. From Theorem 2.1 K=(e),+1 
with rc R. Now ee Kc KR". Hence (e), is represented by zero and T is the 
representation of r alone. Hence T is the image of 9, and the only elements 




of &% mapped onto T and zero are %,+%. From the decomposition (3.22) 
%/R" modulo its radical is isomorphic to Y,®@Ys. The proof for [L(c) ] is simi- 
lar. 


EXAMPLE 1. is cyclic: R=(x, x"-"), where x*=0. 


Case 1. A=R. 

Case 2.A¥R and A=R' u R. Here = (e2),-+r. Now =aix-+aex?+ - 
+a,-1x"—! and since =e:, x" =0, it follows that xe: =0 or x. Similarly esx =0 
or x. AsX=R! u R, =0. Now if also xe.=0, then R, which is im- 
possible. Hence xe,.=x. An arbitrary y e U is of the form y=exyez+? with 
te ®R. Let xexyes=bix+ Then put v Here vx =0, 
xv =bex?+ --- +5, From this (v)? ¢ and, since (v)? 
whence (v), is a nilpotent ideal and v e R. Hence v = eqves & e2Re2 = 0 and 
v=0. Hence where ¢ R. Here W=(e2, x, , with ex=0, 
=x. 

Case 3. A¥R' u R. Here A=(e):+1, and, arguing as above, we conclude 
ex=x, xe=0 or x while AY=(e)+R! u R. We distinguish according as xe=0 
or x, and as R' u or (ee, R). 

Case 3.1. 2, R): 


2 2 
= 41, é2 = é2, = €2¢, = 0, 


= 4%, xe, = 0, = 0, = 
Case 3.2. A=(e1, R): 
= = Xx, xe, = 0. 
Case 3.3. A=(e, R): 
e=e, ex = xe = 


Thus in all five cases and for any radical five similar bound algebras will 
exist. Let ® be any nilpotent algebra and e, and e, two orthogonal idempo- 
tents. For any re let ear=r, res=0, er =0, reg=r. Then we might have 


(1) = MR, (2) A= (er, R), (3) A= (ea, R), (4) (eres, R), or (5) A= (er, R). 


EXAMPLE 2. Let have a basis x1, x2, Xs, %4 where x4 X1%3 =0, 
x1x4=0 and x? =x,x;=0 for 


Here 


0010 00 1 
000 0 0001 
000 0 000 0 
000 0 000 0 




0 0 
0 
0 


0 


1 
0 0 
0 0 
0 0 


while x; and x, are represented by 0 on both sides. Note that the radicals of 
[R(c)] and [L(c)] are not of the same order. By Theorem 3.1 the elements of 
[R(c)] are of the type 


a by ey fi) 
a dq hy 
0 0 by 
0 0 dq, 
and those of [L(c) ] of the type 
a2 be 
0 he 
0 0 0 
0 0 O 


Here aside from the matrix corresponding to x:, [L(c)] can contain the unit 


matrix I, an idempotent 
0 be boge behe 
0 1 £2 he 
0 O 0 0 


0 
the idempotent I—£, or only one of these. If we use the automorphism 
ue dex, 


“ue 


E takes the simple form 


0 00 0 
0 0°O 0 
’ 
0 00 0 
0 
000 0 
010 0 
E= 
0 0 
00 0 0 




In case E (or I—E) actually occurs in [L(c) ], then by Theorem 3.1 [R(c)] is 
further restricted to matrices of the form 

(m0 fi 

0 

0 a 0 

00 

Here write 


0 


From here on it is fairly simple to enumerate the possible algebras bound to®. 

Case 1. is void. 

Here A=A-+A:+R and A, and W, are faithfully represented without rem- 
nants. 

(a) &, is void or J. Then YU, is void or any semisimple matric algebra of 
the permissible form. These can be of order 1, 2, or 4. 

(b) &, contains E, I—E, or both. Then &, is void or contains F, I—F or 
both. 

Case II. Aa contains only one element independent of ®. This element 
can be taken as an idempotent e. 

(a) eisa left unit of %. 

A, must be void. If ¢ is a right unit, then Y, is void, and A=Az=(1, MR). 
We may also have, using automorphisms to simplify the form of the matrices, 
I, 0]. Here Az=(e, x, x3) and is void or has an idempotent 
felI—F, 0, 0]. It is also possible that e[I—F, I, 0] and As=(e, x2, x4) 
while 4, is void or has an idempotent f=[F, 0, 0]. 

(b) eisa right unit of 

The possibilities here are similar to those above. 

(c) ¢ is neither a right nor a left unit of W. 

Here ¢ has E or I—E as its left representation and F or I—F as its right 
representation while %, and %; are void or contain the idempotent E, I—E, 
F, I—F not representing e. 

Case III. Aa contains two elements independent of ®. 

Here %, and are both void and A=A%4. A has a unit. A=(1, e, R) where 
i=[I, ; 0] and e=[F, E, 0] or [F, I—E, 0] or [IJ—F, E, 0] or [I-F, 
I-E, 0}. 

Throughout this example, as a consequence of Theorem 3.5, all remnants 
may be taken as zero. 


100 0 
000 0 
F = 
001 0 
000 




EXAMPLE 3. R=(x1, x2), R?=0, K of characteristic 2. 


In a particular algebra bound to ®, 
12 [/, 7,0], 


where $a$ is not a square in $K$. Here we may take $r(1, 1) = r(1, y) = r(y, 1) = 0$,
but any value $r(y, y) = bx_1 + cx_2$ will be permissible under equations (3.15),
and since $K$ is of characteristic 2 a change of representative will not affect
the remnants. This is a case in which the center of one of the simple algebras
of $\mathfrak{A}/\mathfrak{R}$ is inseparable.


YALE UNIVERSITY, 
New Haven, Conn. 


INTEGERS OF QUADRATIC FIELDS AS 
SUMS OF SQUARES 


BY 
IVAN NIVEN 


1. Introduction. Lagrange proved that every positive rational integer is a 
sum of four squares of rational integers. Our principal result is that in an 
imaginary quadratic field every integer of the form 


(1)   $a + 2b\theta, \qquad \theta^2 = -m,$


m being a positive square-free rational integer, is expressible as a sum of three 
squares of integers of the field. Gaussian integers are treated in §3, integers 
of the general imaginary quadratic field in §4; necessary and sufficient condi- 
tions for two-square sums are given in each case. Section 6 treats real quad- 
ratic integers, and §7 interprets some of the results in the theory of Diophan- 
tine equations. 

It will be recalled that the coefficients of quadratic integers are not always
rational integers. Specifically, if the field is an extension of the rational number
field by $\theta$ in equation (1), and if $m \equiv 3 \pmod{4}$, the integers of the field are
given by


(2)   $\dfrac{a + b\theta}{2},$


where $a$ and $b$ are rational integers, both odd or both even. This introduces a
special problem, which is dealt with for imaginary fields in §5. Roman letters
represent rational integers throughout.

2. Mordell’s theorem. In this section we prove a theorem which was 
stated by L. J. Mordell [1], and upon which most of our study is based. 


Mordell’s proof contains an omission of such import that a complete proof is 
offered here. 


THEOREM 1. If $f(x, y) = ax^2 + 2hxy + by^2$ is a positive binary quadratic form
with integral coefficients, necessary and sufficient conditions that $f$ be expressible
as a sum of the squares of two linear forms with integral coefficients,


(3)   $f(x, y) = (a_1 x + b_1 y)^2 + (a_2 x + b_2 y)^2,$


are that $\Delta = ab - h^2$ be a perfect square and that $d = (a, h, b)$ have no prime factor
of the form $4n+3$ to an odd power.
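
The two conditions of Theorem 1 are easy to test by machine, and for small coefficients the representation (3) can be found by exhaustive search. The sketch below is an illustration only, not Mordell's or the author's method; the function names are hypothetical, and the search relies on the relations $a = a_1^2 + a_2^2$, $b = b_1^2 + b_2^2$, $h = a_1 b_1 + a_2 b_2$ obtained by comparing coefficients in (3).

```python
from math import gcd, isqrt


def mordell_conditions(a, h, b):
    """Conditions of Theorem 1 for the form a x^2 + 2 h x y + b y^2 with a > 0."""
    delta = a * b - h * h
    if a <= 0 or delta < 0 or isqrt(delta) ** 2 != delta:
        return False
    d = gcd(gcd(a, h), b)
    p = 2
    while p * p <= d:
        if d % p == 0:
            e = 0
            while d % p == 0:
                d //= p
                e += 1
            if p % 4 == 3 and e % 2 == 1:
                return False
        p += 1
    # What is left of d is 1 or a single prime occurring to the first power.
    return not (d > 1 and d % 4 == 3)


def two_square_forms(a, h, b):
    """Brute-force search for (a1, b1, a2, b2) satisfying equation (3)."""
    ra, rb = isqrt(a), isqrt(b)
    for a1 in range(-ra, ra + 1):
        for a2 in range(-ra, ra + 1):
            if a1 * a1 + a2 * a2 != a:
                continue
            for b1 in range(-rb, rb + 1):
                for b2 in range(-rb, rb + 1):
                    if b1 * b1 + b2 * b2 == b and a1 * b1 + a2 * b2 == h:
                        return (a1, b1, a2, b2)
    return None


# 2x^2 + 2xy + 5y^2 has ab - h^2 = 9 and d = 1, so a representation exists.
print(mordell_conditions(2, 1, 5), two_square_forms(2, 1, 5))
# 3x^2 + 3y^2 has ab - h^2 = 9 but d = 3, so none exists.
print(mordell_conditions(3, 0, 3), two_square_forms(3, 0, 3))
```

Since $|a_1|, |a_2| \le \sqrt{a}$ and $|b_1|, |b_2| \le \sqrt{b}$, the search is finite and exact for any given positive definite form.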


To prove that these conditions are necessary, we take equation (3) as our 
Presented to the Society, December 29, 1939; received by the editors December 12, 1939. 


406 IVAN NIVEN 
hypothesis and obtain


(4)   $a = a_1^2 + a_2^2, \qquad h = a_1 b_1 + a_2 b_2, \qquad b = b_1^2 + b_2^2.$

It follows that

$\Delta = ab - h^2 = (a_1 b_2 - a_2 b_1)^2.$


There is no loss of generality in assuming $d$ to be square-free. Let $p$ be a prime
of the form $4n+3$ dividing $d$, that is, dividing each of $a$, $b$, and $h$. Using the
theory of the decomposition of an integer into the sum of two squares, we
note that the first and last equations of (4) imply that $p$ is a divisor of $a_1$, $a_2$, $b_1$,
and $b_2$. Hence $p^2$ divides $a$, $b$, $h$, and therefore $d$, which contradicts our hypothesis
that $d$ is square-free.

Conversely, let us assume that


(5)   $ab - h^2 = \Delta_0^2$


and that $d$ is divisible by no prime of the form $4n+3$ to an odd power, or,
what is the same thing, that $d$ is expressible as a sum of two squares of integers.
Because of the identity


$(U^2 + V^2)(u^2 + v^2) = (Uu + Vv)^2 + (Uv - Vu)^2,$


and because we are attempting to prove that an equation of the form (3)
can be set up, we may take


(6)   $d = (a, h, b) = 1.$

The gap in Mordell's argument occurs at this point. He states (page 5), "Now

$ab - h^2 = \Delta_0^2, \qquad h^2 \equiv -\Delta_0^2 \pmod{a},$

and the solution of the congruence for $h$ gives

$h \equiv -\dfrac{\Delta_0 a_1}{a_2} \pmod{a}$

for an appropriate resolution of $a$ as a sum of two integral squares, say,
$a_1^2 + a_2^2$."
We shall show that $a$ is expressible in the latter form, with

(7)   $a_2 h \equiv -\Delta_0 a_1 \pmod{a}.$


Let $p$ be a prime of the form $4n+3$ which divides $ab$. Equation (5) shows
that $ab$ is a sum of two squares; hence the highest power of $p$ dividing $ab$ is
an even one, say $p^{2k}$; it follows that




1940] INTEGERS OF QUADRATIC FIELDS 407 


$p^k \mid h, \qquad p^k \mid \Delta_0.$


Equation (6) implies that $p^{2k}$ divides $a$ or $b$, but not both. Treating every
prime factor of $ab$ which is congruent to 3 modulo 4 in this fashion, we see
that we may write


(8)   $a = P^2 A, \qquad b = Q^2 B, \qquad h = PQH, \qquad \Delta_0 = PQ\Delta_1,$


wherein $P$ and $Q$ are odd, prime to each other, and contain only prime factors
of the form $4n+3$; also $A$ and $B$ contain no such prime factors. Equation (5)
shows that

(9)   $AB = H^2 + \Delta_1^2 = (H + \Delta_1 i)(H - \Delta_1 i).$


Now each prime factor of $A$ is expressible as a sum of two squares in one and
only one way, so that we have


(10)   $A = \prod_j (x_j^2 + y_j^2) = \prod_j (x_j + y_j i)(x_j - y_j i).$


Each of the complex factors in the latter product is a prime in the field $R(i)$,
the rational number field extended by $i$. The unique factorization law holds
in $R(i)$ so that $x_j + y_j i$ divides one of the two factors


$H + \Delta_1 i, \qquad H - \Delta_1 i$


of $AB$, and $x_j - y_j i$ divides the other. Combining the terms of the product (10)
according to this distinction, we may write


(11)   $A = (A_1 - A_2 i)(A_1 + A_2 i) = A_1^2 + A_2^2,$


where


$(A_1 - A_2 i) \mid (H + \Delta_1 i), \qquad (A_1 + A_2 i) \mid (H - \Delta_1 i).$

Similarly we have


(12)   $B = B_1^2 + B_2^2 = (B_1 - B_2 i)(B_1 + B_2 i),$


these factors dividing $H + \Delta_1 i$ and $H - \Delta_1 i$ respectively. Equations (9), (11),
and (12) imply


$H + \Delta_1 i = (A_1 - A_2 i)(B_1 - B_2 i),$

whence we obtain

(13)   $H = A_1 B_1 - A_2 B_2, \qquad \Delta_1 = -A_1 B_2 - A_2 B_1.$

If we write


$a_1 = PA_1, \qquad a_2 = PA_2, \qquad b_1 = QB_1, \qquad b_2 = QB_2,$


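then equations (8), (11), (12), and (13) imply


(14)   $a = a_1^2 + a_2^2, \quad b = b_1^2 + b_2^2, \quad h = a_1 b_1 - a_2 b_2, \quad \Delta_0 = -a_1 b_2 - a_2 b_1.$

It follows that


$a_2 h = a_2 a_1 b_1 - a_2^2 b_2,$


$a_2 h \equiv a_1 a_2 b_1 + a_1^2 b_2 \pmod{a},$

and


$a_2 h \equiv -a_1 \Delta_0 \pmod{a},$

which we set out to prove. In fact, equations (14) imply

(15)   $b_2 = -\dfrac{a_2 h + a_1 \Delta_0}{a}, \qquad b_1 = \dfrac{a_1 h - a_2 \Delta_0}{a},$

and it is easily verified that

$ax^2 + 2hxy + by^2 = (a_1 x + b_1 y)^2 + (a_2 x - b_2 y)^2.$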


3. Gaussian integers. Let us consider $a + 2bi$, where $a$ and $b$ are rational
integers. We have, for an arbitrary integer $t$,


$a + 2bi = (a + t) + 2bi + t i^2,$


which may thus be considered as a quadratic form in 1 and $i$. Mordell's
theorem is applicable; first we wish to show that there exists a rational integral
value of $t$ such that $t(a + t) - b^2$ is a perfect square.

First let $a$ be even, $a = 2A$. We wish to obtain integral $t$ and $x$ to satisfy


$t(2A + t) - b^2 = x^2,$

which may be written in the form

(16)   $(t + A)^2 - x^2 = A^2 + b^2.$


This equation has no solutions if both $A$ and $b$ are odd, but is solvable otherwise.


In case exactly one of $A$ and $b$ is odd, we write the solution

$t + A + x = A^2 + b^2, \qquad t + A - x = 1,$

so that


$t = \dfrac{(A - 1)^2 + b^2}{2}.$


In our application of Theorem 1, we have (in the case considered) satisfied
the condition that the negative of the discriminant of the form be a square.



We now consider the nature of $d$, the greatest common divisor of $t$, $b$, and
$a + t$. The above equations show that


$d = \left( \dfrac{(A - 1)^2 + b^2}{2},\ b,\ \dfrac{(A + 1)^2 + b^2}{2} \right).$

Let $p$ be any odd prime dividing $d$. It is an immediate consequence of the
above equation that $p$ divides $b$, $A - 1$, and $A + 1$. Hence $p$ divides
$(A + 1) - (A - 1) = 2$, which is impossible, and $d$ is not divisible by any odd prime.

In case both $A$ and $b$ are even, we write $A = 2A_1$, $b = 2b_1$, and equation (16)
has the solutions


$t + A + x = 2(A_1^2 + b_1^2), \qquad t + A - x = 2,$


$t = (A_1 - 1)^2 + b_1^2.$


The value of $d$ is now given by the equation


$d = (b,\ (A_1 - 1)^2 + b_1^2,\ (A_1 + 1)^2 + b_1^2).$


The argument of the last paragraph applies again to show that $d$ is divisible
by no odd prime. This completes the discussion when $a$ is even.
In case $a$ is odd, $a = 2A + 1$, equation (16) is replaced by


(17)   $(2t + a)^2 - 4x^2 = a^2 + 4b^2.$

This equation always has rational integral solutions $t$ and $x$. For if we write

$2t + a + 2x = a^2 + 4b^2, \qquad 2t + a - 2x = 1,$

the solution is

$t = A^2 + b^2.$

Again we see that $d$ is divisible by no odd prime, because

$d = (b,\ A^2 + b^2,\ (A + 1)^2 + b^2).$

Recalling the remark after equation (16), we have shown that $a + 2bi$ is
expressible as a quadratic form in 1 and $i$ satisfying the conditions of Theorem 1
provided that not both $a/2$ and $b$ are integral and odd. Hence if these
conditions on $a$ and $b$ are satisfied, the integer $a + 2bi$ is expressible as a sum
of two squares of Gaussian integers.
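
The case analysis of this section amounts to an explicit recipe for $t$. The sketch below is a hedged transcription of that recipe, with the hypothetical function name `choose_t`; it returns the pair $(t, x)$ obtained from the factorizations chosen for equations (16) and (17), or `None` in the excluded case where $a/2$ and $b$ are both odd.

```python
def choose_t(a, b):
    """Return (t, x) with t*(a + t) - b*b == x*x, or None when a/2 and b are both odd."""
    if a % 2 == 1:                       # a = 2A + 1, equation (17)
        A = (a - 1) // 2
        t = A * A + b * b
        x = (a * a + 4 * b * b - 1) // 4
        return t, x
    A = a // 2                           # a = 2A, equation (16)
    if A % 2 == 1 and b % 2 == 1:
        return None                      # (16) has no solution
    if A % 2 == 0 and b % 2 == 0:        # A = 2*A1, b = 2*b1
        A1, b1 = A // 2, b // 2
        t = (A1 - 1) ** 2 + b1 * b1
        x = A1 * A1 + b1 * b1 - 1
        return t, x
    # exactly one of A, b is odd
    t = ((A - 1) ** 2 + b * b) // 2
    x = (A * A + b * b - 1) // 2
    return t, x


# For 6 = 6 + 2*0*i: A = 3, b = 0 gives t = 2, x = 4, and the form
# 8x^2 + 2y^2 = (2x + y)^2 + (2x - y)^2 yields 6 = (2 + i)^2 + (2 - i)^2.
print(choose_t(6, 0))
```

Python's `%` returns a non-negative remainder, so the parity tests also behave correctly for negative $a$ and $b$.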

Conversely, suppose that the Gaussian integer $a + 2bi$ is a sum of two
squares,


$a + 2bi = (c + di)^2 + (e + fi)^2,$


so that

$a = c^2 - d^2 + e^2 - f^2, \qquad b = cd + ef.$




Setting $t = d^2 + f^2$, we have the result

$(a + t)x^2 + 2bxy + ty^2 = (cx + dy)^2 + (ex + fy)^2.$


Theorem 1 shows that $t(a + t) - b^2$ must be the square of an integer, and the
conditions for equations (16) or (17) must be fulfilled for $a$ even or odd, respectively.
But equation (16) cannot be satisfied if $a/2$ and $b$ are odd integers.
Hence the Gaussian integer $a + 2bi$ is not expressible as a sum of two squares
if $a/2$ and $b$ are odd rational integers. We have proven the first statement of
the following theorem.


THEOREM 2. A Gaussian integer of the form $a + 2bi$ is expressible as a sum
of two squares of Gaussian integers if and only if not both $a/2$ and $b$ are odd
integers. Every Gaussian integer of the form $a + 2bi$ is expressible as a sum of
three squares. A Gaussian integer is expressible as a sum of squares of Gaussian
integers if and only if its imaginary coordinate is even.
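
The first statement of Theorem 2 can be spot-checked by a bounded search. The following sketch is illustrative only: the names `two_square_gaussian` and `predicted` are hypothetical, and the search bound is an assumption, so a failed search is conclusive only in the cases where the theorem itself rules out a representation.

```python
def two_square_gaussian(a, b, bound=4):
    """Search c+di, e+fi with entries in [-bound, bound] whose squares sum to a + 2bi."""
    rng = range(-bound, bound + 1)
    for c in rng:
        for d in rng:
            for e in rng:
                for f in rng:
                    # (c+di)^2 + (e+fi)^2 = (c^2 - d^2 + e^2 - f^2) + 2(cd + ef)i
                    if c * c - d * d + e * e - f * f == a and c * d + e * f == b:
                        return (c, d), (e, f)
    return None


def predicted(a, b):
    """Criterion of Theorem 2: representable unless a/2 and b are both odd integers."""
    return not (a % 2 == 0 and (a // 2) % 2 == 1 and b % 2 == 1)


print(predicted(6, 0), two_square_gaussian(6, 0))   # a/2 odd, b even: representable
print(predicted(2, 1), two_square_gaussian(2, 1))   # a/2 and b odd: not representable
```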


The last remark is trivial. The second statement is a corollary of the first.
For if $a/2$ and $b$ are integral and odd, the integer $a - 1 + 2bi$ is expressible as a
sum of two squares, whence $a + 2bi$ is a sum of three squares, one of which is
unity.

4. General imaginary quadratic fields. We now consider integers of the
form $\gamma = a + 2b\theta$ where $a$ and $b$ are rational integers, and


(18)   $\theta^2 = -m,$


$m$ being an integer greater than unity with no square factors. We note that $\gamma$
is expressible in infinitely many ways as a quadratic form in 1 and $\theta$,


$\gamma = (a + tm) + 2b\theta + t\theta^2,$


$t$ being an arbitrary integer. If $t$ can be selected so that this quadratic form
is expressible as a sum of two squares of linear forms, then $\gamma$ is a sum of two
squares of integers of the field $R(\theta)$.

On the other hand, if $a + 2b\theta$ is a sum of two squares,


$a + 2b\theta = (c + d\theta)^2 + (e + f\theta)^2,$

we set $t = d^2 + f^2$ as before and obtain

$(a + tm)x^2 + 2bxy + ty^2 = (cx + dy)^2 + (ex + fy)^2.$


We have shown, therefore, that the integer $\gamma$ is a sum of two squares of
integers of $R(\theta)$ with rational integral coordinates if and only if the quadratic
form $[a + tm, 2b, t]$ is expressible as a sum of two squares of linear forms by
means of a suitable choice of the rational integer $t$. Hence Theorem 1 is applicable.


THEOREM 3. The integer $a+2b\theta$ is expressible as the sum of the squares of
two integers of the form $c+d\theta$ if and only if there exists an integer $t$ such that

    $mt^2 + at - b^2$

is a perfect square, and such that $(t, b, a+mt)$ is not divisible by a prime of the
form $4n+3$ to an odd power.


We now consider the problem of expressing the integer $a+2b\theta$ as a sum
of three squares. The equation

(19)    $a + 2b\theta - (u + v\theta)^2 = (tm + a - u^2) + 2(b - uv)\theta + (t - v^2)\theta^2$

leads us to search for integral values of $t$, $u$, and $v$ which will make

(20)    $(t - v^2)(tm + a - u^2) - (b - uv)^2$

a perfect square. First we set the terms free from $t$ equal to a square,

    $v^2(u^2 - a) - (b - uv)^2 = y^2,$

so that

(21)    $u = \dfrac{v^2 a + b^2 + y^2}{2bv}.$

To obtain an integer from this expression for $u$, we set $v = b$ and $y = 2Yb$ or
$y = (2Y+1)b$ according as $a$ is odd or even; the integer $Y$ is arbitrary.
The expression (20) is written

    $mt^2 + t(a - u^2 - mb^2) + y^2 = \left(y - \dfrac{pt}{q}\right)^2,$

and a solution is

(22)    $t = q(2py + aq - u^2 q - mb^2 q),$

provided

(23)    $p^2 - mq^2 = 1.$

Note that $pt/q$ is an integer.
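The construction in (21), (22), and (23) can be traced numerically. The sketch below is illustrative only: the function names are hypothetical, the Pell solution is found by naive search (adequate for small $m$), $b$ is assumed nonzero, and only the perfect-square property of (20) is checked, not the divisor condition treated next.

```python
from math import isqrt

def pell_solution(m):
    """Smallest q >= 1, with its p, satisfying p^2 - m q^2 = 1.
    Assumes m is a positive non-square integer; naive search."""
    q = 1
    while True:
        p = isqrt(m * q * q + 1)
        if p * p == m * q * q + 1:
            return p, q
        q += 1

def three_square_parameters(m, a, b, Y):
    """Follow (21)-(23) with v = b and return (u, v, y, t)."""
    v = b
    y = 2 * Y * b if a % 2 else (2 * Y + 1) * b
    numerator = v * v * a + b * b + y * y     # numerator of (21)
    denominator = 2 * b * v
    assert numerator % denominator == 0       # this choice of y makes u integral
    u = numerator // denominator
    p, q = pell_solution(m)
    t = q * (2 * p * y + a * q - u * u * q - m * b * b * q)   # equation (22)
    return u, v, y, t

def expression_20(m, a, b, u, v, t):
    """Value of expression (20)."""
    return (t - v * v) * (t * m + a - u * u) - (b - u * v) ** 2

def is_square(n):
    return n >= 0 and isqrt(n) ** 2 == n
```

For example, `three_square_parameters(2, 3, 1, 1)` yields $u = 4$, $y = 2$, $t = -36$, and expression (20) then equals $3136 = 56^2$, in agreement with $y - pt/q = 2 + 54$.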
We shall also need to account for the greatest common divisor of the
coefficients of the quadratic form on the right side of equation (19),

(24)    $(b - uv,\ a - u^2 + tm,\ t - v^2) = (b - ub,\ a - u^2 + tm,\ t - b^2).$

LEMMA. Let $\pi_1, \pi_2, \cdots, \pi_r$ be the primes of the form $4n+3$ which divide $b$.
Then we can choose $y$ in (21) so that

(25)    $u^2 \not\equiv a \pmod{\pi_j}, \qquad j = 1, 2, \cdots, r.$

First consider $a$ odd, $a = 2A - 1$. Then $y = 2Yb$, and (21) becomes

    $u = A + 2Y^2.$

In this case (25) becomes

(26)    $(A + 2Y^2)^2 \not\equiv 2A - 1 \pmod{\pi_j}.$


When $Y$ ranges over a complete residue system modulo $\pi_j$, $Y^2$ (and therefore
$2Y^2+A$) takes on $\tfrac12(\pi_j+1)$ incongruent values modulo $\pi_j$. From the theory
of quadratic residues it follows that $(2Y^2+A)^2$ takes on at least $[\tfrac14(\pi_j+1)]$
incongruent values modulo $\pi_j$, where $[x]$ has the usual number-theoretic
meaning, namely, the greatest integer less than or equal to $x$. Since
$[\tfrac14(\pi_j+1)] \ge 2$ for all primes greater than 5, it is possible to select a value $s_j$
from the complete system of residues modulo $\pi_j$, so that (26) is satisfied for
all primes greater than 5 provided

(27)    $Y \equiv s_j \pmod{\pi_j}, \qquad \pi_j > 5.$

Since 5 is not a prime of the form $4n+3$, we take $\pi_j = 3$ as the special case.
In this case, choose $Y \equiv 0, 1, 2 \pmod 3$ when $A \equiv 0, 1, 2 \pmod 3$ respectively,
and (25) is satisfied.

Thus an $s_j$ can be found corresponding to each $\pi_j$ in (27) including the
case $\pi_j = 3$ if it happens to be present, so that values of $Y$ satisfying (26) may
be found by use of the Chinese remainder theorem. Hence the lemma is
proven in case $a$ is odd.
If $a = 2A$, we have $y = (2Y+1)b$; equations (21) and (25) become

    $u = A + 1 + 2Y^2 + 2Y,$

and

(28)    $(A + 1 + 2Y^2 + 2Y)^2 \not\equiv 2A \pmod{\pi_j},$

respectively. Again let $Y$ range over a complete residue system modulo $\pi_j$;
each of the quantities $Y^2+Y$ and $2Y^2+2Y+A+1$ takes on $\tfrac12(\pi_j+1)$
incongruent values modulo $\pi_j$. Thus the expression

    $(2Y^2 + 2Y + A + 1)^2$

takes on at least $[\tfrac14(\pi_j+1)]$ incongruent values modulo $\pi_j$. As in the earlier
case, a relation of the type (27) is established. In case the prime under
discussion is 3, equation (28) is satisfied by choosing $Y \equiv 0, 1, 2 \pmod 3$ when
$A \equiv 0, 1, 2 \pmod 3$ respectively. The proof of the lemma is completed by use
of the Chinese remainder theorem, as in the previous case.
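The residue counts used in the lemma are easy to confirm for individual primes. The sketch below (function name hypothetical, not from the paper) lists, for each residue $A$ modulo a prime, the values of $Y$ for which the incongruences (26) and (28) hold.

```python
def lemma_witnesses(prime):
    """For each residue A mod prime, list the Y with u^2 incongruent to a:
    odd case  a = 2A - 1, u = A + 2Y^2          (congruence (26));
    even case a = 2A,     u = A + 1 + 2Y^2 + 2Y (congruence (28))."""
    odd_case, even_case = {}, {}
    for A in range(prime):
        odd_case[A] = [
            Y for Y in range(prime)
            if ((A + 2 * Y * Y) ** 2 - (2 * A - 1)) % prime != 0
        ]
        even_case[A] = [
            Y for Y in range(prime)
            if ((A + 1 + 2 * Y * Y + 2 * Y) ** 2 - 2 * A) % prime != 0
        ]
    return odd_case, even_case
```

For the prime 3, the choice $Y \equiv A$ stated in the text appears in both lists for every $A$; for primes such as 7, 11, 19 every list is non-empty, which is all the Chinese remainder step requires.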

Having thus chosen a suitable value of $u$, we note that $q$ in (23) may be
selected so that it is divisible by $b(u-1)$. For we may set

(29)    $q = b(u - 1)Q,$

where $Q$ is a solution of

(30)    $p^2 - mb^2(u - 1)^2 Q^2 = 1.$

This is a Pell equation, and is known to have solutions $p$ and $Q$ because
$mb^2(u-1)^2$ is not a square.

It is not difficult to show that the expression on the right side of equation
(24) has no prime of the form $4n+3$ as a factor. For suppose that $\pi$ is such a
prime dividing $b - ub$. Equation (29) shows that $\pi$ divides $q$, and consequently
equation (22) implies that $\pi$ divides $t$. First, if $\pi$ divides $b$, the lemma states
that $\pi$ does not divide $u^2 - a$, and hence $a - u^2 + tm$ is prime to $\pi$. On the other
hand, if $\pi$ divides $u - 1$ but not $b$, it is clear that $\pi$ cannot divide the
expression $t - b^2$ in equation (24).

We have satisfied the conditions of Theorem 1, and equation (19) may 
therefore be interpreted as follows: 


THEOREM 4. Every integer of the form $a+2b\theta$ of the quadratic field $R(\theta)$
defined by equation (18) is expressible as a sum of three squares of integers of the
field.


5. A special case. We now examine more thoroughly the fields $R(\theta)$ where
the integer $m$ of equation (19) is of the form $4n+3$.


THEOREM 5. In case $m \equiv 3 \pmod 4$, the integer $a+b\theta$, with $b$ an odd rational
integer, is expressible as a sum of two squares of integers of the field if and only
if $4a+4b\theta$ is expressible as a sum of two squares of integers of the type $c+d\theta$
(see Theorem 3). Also, the integer $\tfrac12 a+\tfrac12 b\theta$, with $a$ and $b$ odd rational integers,
is expressible as a sum of two squares of integers of the field if and only if $2a+2b\theta$
is expressible as a sum of two squares of integers of the type $c+d\theta$.


It is obvious that each of these conditions is necessary for the proposed
representation. To show that the condition expressed in the first statement is
sufficient, we assume that $4a + 4b\theta$ is a sum of two squares,

(31)    $4a + 4b\theta = (x_1 + y_1\theta)^2 + (x_2 + y_2\theta)^2.$

This implies the congruences

    $0 \equiv x_1^2 + x_2^2 - my_1^2 - my_2^2 \pmod 4,$

    $0 \equiv x_1^2 + x_2^2 + y_1^2 + y_2^2 \pmod 4.$

Hence every one of $x_1$, $x_2$, $y_1$, and $y_2$ is even, or every one is odd. Equation (31)
can be divided by 4 to give the desired result.
We now turn to the second statement of the theorem and assume that

(32)    $2a + 2b\theta = (x_1 + y_1\theta)^2 + (x_2 + y_2\theta)^2.$

Since $a$ and $b$ are odd, we obtain the congruences

    $2 \equiv x_1^2 + x_2^2 + y_1^2 + y_2^2 \pmod 4, \qquad 1 \equiv x_1y_1 + x_2y_2 \pmod 2.$

These imply that $x_1$ and $y_1$ are both even or both odd, and an analogous result
for $x_2$ and $y_2$. The equation (32) can be divided by 4 to give the desired result,
and we have proven the theorem.


THEOREM 6. In case $m \equiv 3 \pmod 4$, every integer of the field $R(\theta)$ is
expressible as a sum of three squares of integers of the field.


Because of Theorem 4 we need consider only integers of the types

(33)    $a + b\theta, \qquad b \equiv 1 \pmod 2,$

and

(34)    $\dfrac{a}{2} + \dfrac{b}{2}\theta, \qquad a \equiv b \equiv 1 \pmod 2.$

By Theorem 4 we have

(35)    $4a + 4b\theta = \sum_{j=1}^{3} (x_j + y_j\theta)^2,$

from which we obtain the congruences

    $0 \equiv x_1^2 + x_2^2 + x_3^2 + y_1^2 + y_2^2 + y_3^2 \pmod 4,$
    $0 \equiv x_1y_1 + x_2y_2 + x_3y_3 \pmod 2.$

If there were a disparity between $x_1$ and $y_1$ with respect to 2, these
congruences would imply

    $3 \equiv x_2^2 + x_3^2 + y_2^2 + y_3^2 \pmod 4,$
    $0 \equiv x_2y_2 + x_3y_3 \pmod 2,$

which have no solutions in integers. Hence $x_1$ and $y_1$ are both odd or both
even, and an analogous argument holds for the pairs $x_2$, $y_2$ and $x_3$, $y_3$. Our
theorem is proven for integers of types (33) by dividing equation (35) by 4.

Turning to integers of the type (34), we write the equation

(36)    $2a + 2b\theta = \sum_{j=1}^{3} (x_j + y_j\theta)^2,$

using Theorem 4 as our authority. This equation implies the congruences

    $2 \equiv x_1^2 + x_2^2 + x_3^2 + y_1^2 + y_2^2 + y_3^2 \pmod 4,$
    $1 \equiv x_1y_1 + x_2y_2 + x_3y_3 \pmod 2.$

If $x_1$ and $y_1$ are incongruent modulo 2, we would have

    $1 \equiv x_2^2 + x_3^2 + y_2^2 + y_3^2 \pmod 4, \qquad 1 \equiv x_2y_2 + x_3y_3 \pmod 2,$

which are manifestly impossible in integers. Hence $x_1$ and $y_1$ are both even or
both odd; a similar statement holds for the pairs $x_2$, $y_2$ and $x_3$, $y_3$. Dividing
equation (36) by 4, we have completed the proof of the theorem.
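The parity arguments in the proofs of Theorems 5 and 6 reduce to a finite enumeration, since $x^2 \bmod 4$ and $xy \bmod 2$ depend only on the parities of $x$ and $y$. The following sketch (function name hypothetical) lists every parity pattern compatible with a given pair of congruences.

```python
from itertools import product

def parity_patterns(n_pairs, sum_sq_mod4, sum_xy_mod2):
    """All parity patterns ((x_1, y_1), ..., (x_n, y_n)) with entries in {0, 1}
    such that sum(x_j^2 + y_j^2) = sum_sq_mod4 (mod 4)
    and sum(x_j * y_j) = sum_xy_mod2 (mod 2)."""
    found = []
    for bits in product((0, 1), repeat=2 * n_pairs):
        pairs = [(bits[2 * j], bits[2 * j + 1]) for j in range(n_pairs)]
        squares_ok = sum(x * x + y * y for x, y in pairs) % 4 == sum_sq_mod4
        products_ok = sum(x * y for x, y in pairs) % 2 == sum_xy_mod2
        if squares_ok and products_ok:
            found.append(tuple(pairs))
    return found
```

Called with `(2, 0, 0)` and `(2, 2, 1)` it covers the two systems arising from (31) and (32); with `(3, 0, 0)` and `(3, 2, 1)` it covers those arising from (35) and (36). In each case every surviving pattern has $x_j \equiv y_j \pmod 2$ for all $j$, which is what permits division by 4.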

6. Real quadratic fields. Let

    $f(x, y) = ax^2 + 2hxy + by^2$

be a positive form with integral coefficients. Mordell [2] has shown that $f$ is
expressible as a sum of five squares of linear forms with integral coefficients;
also he has shown that $f$ is expressible as a sum of four squares of linear forms
with integral coefficients if and only if $ab - h^2$ is a sum of three squares of
integers, that is, if and only if $ab - h^2$ is not of the form $4^r(8s+7)$. In case the
expression $ab - h^2$ equals zero, the form $f$ is expressible as a sum of four squares
of linear forms with integral coefficients.
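The three-square condition quoted here is the classical criterion that a non-negative integer is a sum of three integral squares exactly when it is not of the form $4^r(8s+7)$. A minimal sketch comparing that criterion with a direct search follows; the function names are illustrative assumptions.

```python
from math import isqrt

def excluded_form(n):
    """True when n = 4^r (8s + 7) for some integers r, s >= 0."""
    if n <= 0:
        return False
    while n % 4 == 0:
        n //= 4
    return n % 8 == 7

def three_squares_by_search(n):
    """Direct search for n = x^2 + y^2 + z^2 in non-negative integers,
    taking x >= y without loss of generality."""
    for x in range(isqrt(n) + 1):
        for y in range(x + 1):
            rest = n - x * x - y * y
            if rest < 0:
                break
            z = isqrt(rest)
            if z * z == rest:
                return True
    return False
```

Applied to $D = ab - h^2$, `excluded_form(D)` returning `False` corresponds to the four-square case of Mordell's result as stated above.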

Let us now consider the field $R(m^{1/2})$, where $m$ is a square-free rational
integer greater than unity. The integer $a+2bm^{1/2}$ can be written in the form

(37)    $a + 2bm^{1/2} = (a - tm) + 2bm^{1/2} + t(m^{1/2})^2,$

a quadratic form in 1 and $m^{1/2}$. If Mordell's theorems above are to apply,
we must first inquire whether $t$ can be chosen so that the right side of
equation (37) is a positive form. The question is whether a positive value of $t$ can
be chosen so that

(38)    $D = (a - mt)t - b^2 > 0.$

If we define $K$ by the equation

(39)    $K = (a^2 - 4mb^2)^{1/2},$

it is seen that $D$ vanishes when $t$ has the values

    $\dfrac{a - K}{2m}, \qquad \dfrac{a + K}{2m}.$

Furthermore, if $K$ is real, and if $t$ lies between these values, then $t$ and $D$ are
positive, and the right side of equation (37) is a positive form; hence the first
Mordell theorem stated at the beginning of this section is applicable. On the
other hand, if $t$ equals one of the above values, being real, then $D$ is zero,
and we apply the last Mordell theorem stated.


THEOREM 7. The integer $a+2bm^{1/2}$ is expressible as a sum of five squares of
integers of the form $c+dm^{1/2}$ if and only if the quantity $K$ defined by (39) is real
and the closed interval

(40)    $\left( \dfrac{a - K}{2m},\ \dfrac{a + K}{2m} \right)$

contains a rational integer.


Since any integer contained in the interval (40) is of necessity positive,
it is clear that these conditions are sufficient. Conversely, let us assume that
there exist integral values $x_j$, $y_j$ $(j = 1, \cdots, 5)$ such that

    $a + 2bm^{1/2} = \sum_{j=1}^{5} (x_j + y_j m^{1/2})^2,$

from which we obtain

    $a = \sum_{j=1}^{5} x_j^2 + m\sum_{j=1}^{5} y_j^2, \qquad b = \sum_{j=1}^{5} x_j y_j.$

The equation

    $a - 2bm^{1/2} = \sum_{j=1}^{5} (x_j - y_j m^{1/2})^2$

shows that $a^2 - 4mb^2$ is positive, and consequently $K$ in (39) is real. Consider
the function

    $D = -mt^2 + at - b^2,$

$t$ being looked upon as a continuous variable. Its graph is a parabola. Its
zeros are the end-points of the interval (40). Furthermore, any value of $t$
for which $D$ is positive lies in the interval (40). When $t$ is given the integral
value $\sum_{j=1}^{5} y_j^2$ we obtain

    $D = \sum_{j=1}^{5} x_j^2 \sum_{j=1}^{5} y_j^2 - \Big(\sum_{j=1}^{5} x_j y_j\Big)^2.$

By the elementary theory of inequalities, this is not negative. Hence we have
exhibited an integral value satisfying the conditions of the theorem.
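Theorem 7 can be compared with a direct enumeration for small values. The sketch below rests on two assumptions that are not statements of the paper: it restricts to $a \ge 0$, and it tests the interval condition in the equivalent integer form "some integer $t$ has $D(t) \ge 0$", which avoids computing $K$. The function names are hypothetical.

```python
from math import isqrt

def interval_has_integer(m, a, b):
    """For a >= 0 and m > 0: does some integer t satisfy
    D(t) = (a - m t) t - b^2 >= 0, i.e. lie between (a - K)/(2m) and (a + K)/(2m)?"""
    return any((a - m * t) * t - b * b >= 0 for t in range(0, a // m + 1))

def five_square_values(m, a_max):
    """All (a, b) with 0 <= a <= a_max such that
    a = sum(x_j^2 + m y_j^2) and b = sum(x_j y_j) over five integer pairs."""
    r = isqrt(a_max) + 1
    terms = {
        (x * x + m * y * y, x * y)
        for x in range(-r, r + 1)
        for y in range(-r, r + 1)
        if x * x + m * y * y <= a_max
    }
    reachable = {(0, 0)}
    for _ in range(5):
        # Add one more square (x + y sqrt(m))^2; the zero pair is allowed.
        reachable = {
            (a + da, b + db)
            for (a, b) in reachable
            for (da, db) in terms
            if a + da <= a_max
        }
    return reachable
```

With $m = 2$, for instance, $3 + 2\sqrt{2} = (1 + \sqrt{2})^2$ is found by the enumeration and the interval $[\tfrac12, 1]$ contains $t = 1$, whereas $2 + 2\sqrt{2}$ fails both tests.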


THEOREM 8. The integer $a+2bm^{1/2}$ is expressible as a sum of four squares
of integers of the form $c+dm^{1/2}$ if and only if $K$ defined by (39) is real and the
closed interval (40) contains a rational integer $t$ so that the value of $D$ in equation
(38) is expressible as a sum of three squares of rational integers.


This theorem needs no explanation, since it is an immediate extension of 
Theorem 7, obtained by the use of Mordell’s work as outlined at the begin- 
ning of this section. It is also possible to state theorems analogous to the last 
theorem for the situations wherein we wish two-square and three-square 
sums; this would be done by use of Theorem 1 and other work [3] of Mordell. 

7. Consequences in the theory of Diophantine equations. The first state- 
ment of Theorem 2 may be interpreted as follows: 


THEOREM 9. The Diophantine equations

    $x^2 + y^2 - z^2 - w^2 = a, \qquad xz + yw = b,$

are solvable simultaneously if and only if not both $\tfrac12 a$ and $b$ are integral and odd.


The second statement of Theorem 2 together with Theorem 4 leads to the 
following result. 


THEOREM 10. If $a$ and $b$ are arbitrary integers, and if $m$ is unity or an
integer greater than unity which is not a square, then the equations

    $x^2 + y^2 + z^2 - m(w^2 + u^2 + v^2) = a,$
    $xw + yu + zv = b$

are solvable simultaneously in integers.


Theorem 4 was proven with m a square-free integer, but the proof is 
valid with the less restrictive hypothesis that m be no square. This hypothesis 
is needed to insure solutions for the Pell equation (30). Finally we rewrite 
Theorem 7. 


THEOREM 11. If $a$ and $b$ are arbitrary integers, and if $m$ is any positive
integer, then the equations

    $\sum_{j=1}^{5} (x_j^2 + my_j^2) = a, \qquad \sum_{j=1}^{5} x_j y_j = b$

have simultaneous solutions in integers if and only if the quantity $K$ defined by
equation (39) is real and the closed interval (40) contains a rational integer.


The restriction that m be square-free contained in Theorem 7 is aban- 
doned here because it was not used in the proof. It was included in Theorem 7 
merely because quadratic integers are defined in terms of a square-free ra- 
tional integer. 

REFERENCES 

1. L. J. Mordell, On the representation of a binary quadratic form as a sum of squares of
linear forms, Mathematische Zeitschrift, vol. 35 (1932), pp. 1-15.

2. L. J. Mordell, A new Waring's problem, Quarterly Journal of Mathematics, vol. 1 (1930),
pp. 276-288.

3. L. J. Mordell, On binary quadratic forms, Journal für die reine und angewandte Mathematik,
vol. 167 (1932).


UNIVERSITY OF ILLINOIS,
Urbana, Ill.


ON FINITELY MEAN VALENT FUNCTIONS. I 


BY 
D. C. SPENCER 


1. We suppose $f(z)$ is regular in $|z| < 1$ and denote by $W$ the Riemann
domain which is the transform of $|z| < 1$ by $f$. We shall say that $f(z)$ has
valency $p$ if $f(z)$ takes no value $w$ more than $p$ times. More generally, let
$W(R)$ be the area (regions covered multiply being counted multiply) of that
portion of $W$ which lies in the circle $|w| < R$; then, if

(1.1)    $W(R) \le p\pi R^2$

for all $R > 0$, where $p$ is a positive number (not necessarily integral), we shall
say that $f(z)$ is $p$ mean valent (p.m.v.)(1). This paper is a sequel to one of the
same title to appear shortly in the Proceedings of the London Mathematical
Society(2) in which I have shown that many of the known theorems
concerning p-valent functions may be extended to the wider class of p.m.v. functions.
I discuss here the behavior of p.m.v. functions on paths tending to points
on the circumference $|z| = 1$.

The theorems which I discuss here remain true under hypotheses
somewhat less restrictive than the one stated above. For example, the hypothesis
that $W(R) \le p\pi R^2$ only for $R \ge R_0 > 0$ would suffice (constants now depending
on $R_0$ as well as $p$). Furthermore, slightly less precise versions of the theorems
(with $p$ replaced by $p+\epsilon$) could be stated subject to the still weaker condition
that

    $\limsup_{R\to\infty} \dfrac{W(R)}{\pi R^2} \le p.$

Certain theorems(3) proved elsewhere, however, require the full strength of
(1.1) for all $R > 0$, and for this reason I have not introduced a new definition
here.

2. We begin by expressing the inequality (1.1) in a form more convenient
for our purpose. Let $n(r, w)$ be the number of times (necessarily bounded by
a constant depending on $r$) that $f(z)$ takes on the value $w$ in $|z| < r$; and let us
take


Presented to the Society, April 27, 1940; received by the editors October 13, 1939, and, in 
expanded form, April 4, 1940. 

(1) This definition was suggested to me by Professor J. E. Littlewood, to whom I am also
indebted for advice in the preparation of the paper.

(2) This paper will be referred to as $V_1$.

(3) For example, Theorem 1 of $V_1$. The complication of an additional parameter $R_0$ is
avoided thereby as well.


(2.1)    $p(r, R) = \dfrac{1}{2\pi}\int_0^{2\pi} n(r, Re^{i\psi})\,d\psi,$

(2.2)    $p(R) = p(1, R) = \lim_{r\to1} p(r, R).$


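Since $p(r, R)$ is an increasing function of $r$, $p(R)$ exists (but may be infinite).
We have

    $W(R) = \lim_{r\to1} \int_0^{R}\!\int_0^{2\pi} n(r, \rho e^{i\psi})\,\rho\,d\rho\,d\psi$

    $= \int_0^{R} \Big( \lim_{r\to1} \dfrac{1}{2\pi}\int_0^{2\pi} n(r, \rho e^{i\psi})\,d\psi \Big)\,d(\pi\rho^2)$

    $= \int_0^{R} p(\rho)\,d(\pi\rho^2).$

Hence the hypothesis (1.1) may be expressed in the form

(2.3)    $\int_0^{R_1} p(R)\,d(R^2) \le pR_1^2.$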


3. We shall make frequent use of the following lemma:

LEMMA 1. Suppose $s_1 \ge s_2$. Then the hypothesis

(3.1)    $\int_0^{R_1} p(R)\,d(R^{s_1}) \le pR_1^{s_1}$    $(R_1 > 0)$

implies

(3.2)    $\int_0^{R_1} p(R)\,d(R^{s_2}) \le pR_1^{s_2}$    $(R_1 > 0)$,

but not conversely.


Making some trivial transformations of variable, we see it is enough to
show that, if $s \ge 1$,

(3.3)    $\int_0^{R_1} p_1(R)\,d(R^s) \le pR_1^s$    $(R_1 > 0)$

implies

(3.4)    $\int_0^{R_1} p_1(R)\,dR \le pR_1$    $(R_1 > 0)$,

where $p_1(R) = p(R^{1/s_2})$, but that (3.4) does not imply (3.3).


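Integrating by parts, we have (dropping subscripts)

    $\int_0^{R_1} p(R)\,dR = \int_0^{R_1} \dfrac{1}{sR^{s-1}}\,p(R)\,d(R^s)$

    $= \dfrac{1}{sR_1^{s-1}} \int_0^{R_1} p(R)\,d(R^s) + \dfrac{s-1}{s} \int_0^{R_1} \dfrac{dR}{R^{s}} \int_0^{R} p(\rho)\,d(\rho^s)$

    $\le \dfrac{1}{s}\,pR_1 + \dfrac{s-1}{s}\,pR_1$

    $= pR_1$

by (3.3).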
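On the other hand, the converse implication is false. In fact, take

(3.5)    $p(R) = \begin{cases} 1, & 2\nu - 1 \le R < 2\nu \quad (\nu = 1, 2, \cdots), \\ 0, & \text{otherwise}; \end{cases}$

and write $R_1 = n + \theta$, where $n$ is an integer and $0 \le \theta < 1$. Then

    $\int_0^{R_1} p(R)\,dR = \sum_{\mu=1}^{[n/2]} 1 + \begin{cases} 0, & n \text{ even}, \\ \theta, & n \text{ odd}, \end{cases}$

    $= \begin{cases} \tfrac12 n, & n \text{ even}, \\ \tfrac12(n-1) + \theta, & n \text{ odd}, \end{cases}$

    $\le \tfrac12 R_1.$

Hence (3.4) is satisfied with $p = \tfrac12$. But

    $\int_0^{2\nu} p(R)\,d(R^s) = \sum_{\mu=1}^{\nu} \{(2\mu)^s - (2\mu - 1)^s\} > \tfrac12 (2\nu)^s$

if $s > 1$ and $\nu > \nu_0(s)$. Thus, if $s > 1$, (3.3) is false for $R_1 = 2\nu$, $\nu > \nu_0$. We have
shown that the converse of the lemma is false for some function $p(R)$, but
not for a $p(R)$ corresponding to an actual Riemann domain. However, the
$p(R)$ of the schlicht function which maps the unit circle on the domain shown
in Fig. 1 differs as little as we please from the choice (3.5), and for it,
therefore, the converse of the lemma is false.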

4. Lemma 1 shows that the hypothesis

(s)    $\int_0^{R_1} p(R)\,d(R^s) \le pR_1^s$

is the stronger the larger $s$ is. For the sake of completeness I include the
following two theorems (but they may be omitted by the reader if he so desires;
they have no bearing on the rest of the theory).


THEOREM 1. If (s) is true for all $s > 0$, and $p(R)$ corresponds to a Riemann
domain $W$(4), then $p(R) \le p$.


THEOREM 2. If 
f(z) = a2 + +--- 


is mean p-valent, so is the (generally algebraic(*)) function {f(z*)}/*. On the 
other hand, if k>1, the mean p-valency of the function 


= + + + --- 
does not imply that of the function (of form f;) 


= ats + 2? 


If f(z) is p-valent, then so is { Fe(z"/*) } k, This result and its converse are 
well known when the functions are p-valent(*). 
We take Theorem 1 first, and note that if for a given value of $R$, $R_0$ say,

    $p(R_0) = p(1, R_0) > p,$

then, since $p(r, R_0) \to p(R_0)$ as $r \to 1$, there exists a $\delta > 0$ and $r_0 = r_0(\delta) < 1$ such
that, for $r > r_0$,


(4) The theorem is false if this clause is omitted (and is therefore not trivial).
(5) $\{f(z^k)\}^{1/k}$ has branch points at the zeros of $f$ other than the origin. In the neighborhood
of the origin, however,
If $f$ is mean 1-valent, then $f$ has at most one zero (by the definition of mean 1-valency), and in
this case, therefore, $\{f(z^k)\}^{1/k}$ is regular in $|z| < 1$ (and so of the form $f_k$).
(6) See $V_1$.


FIG. 1


(4.1)    $p(r, R_0) > p + \delta.$


We show that this cannot happen.

Suppose it does. Then in the first place $p(r, R)$ is discontinuous at $R = R_0$,
qua function of $R$, for each fixed $r > r_0$. For if it were continuous we should
have(7)

    $\lim_{s\to\infty} \dfrac{1}{R_0^s}\int_0^{R_0} p(r, R)\,d(R^s) = \lim_{s\to\infty} \int_0^1 p(r, xR_0)\,d(x^s) = p(r, R_0),$

which is incompatible with the combination (4.1) and (s) for $R = R_0$.

Now let $B(r)$ be the transform of the circumference $|z| = r$ by $f(z)$ ($B(r)$ is
the boundary of $W(r)$). Then $B(r)$ is an analytic curve, and crosses the
circumference $|w| = R_0$ a finite (even) number of times if it crosses it at all.
If $B(r)$ does not meet $|w| = R_0$, or meets it only in points, then it is obvious
that $p(r, R)$ is continuous at $R_0$. Hence the intersection of $B(r)$ with $|w| = R_0$
contains one or more intervals if $r > r_0$. These intervals depend upon $r$, but
by (4.1) the intervals corresponding to any $r > r_0$ have positive total length.
It follows that if $r > r_0$ the plane measure of $B(r)$ is positive, and so the length
(or linear measure) of $B(r)$ is infinite. This is a contradiction of the regularity
of $f(z)$ in $|z| < 1$, and proves Theorem 1.

Next, to prove the first part of Theorem 2, let »(R), :(R) correspond 
respectively to f(z), { f(z*)}/*. Then 


= p(R*). 


In fact, {f(z)}*/* (or branch thereof) maps | z| <1 cut along a radius from 0 
to 1 on a surface S with function (1/k)p(R*); hence {f(z*)}*/* (which maps 
|z| <1 on S covered k-times) has for function 


k-(1/k)p(R*) = p(R*). 
Finally 


Ri Rt 
= f = f p(R)d(rR**) < puR?, 
0 0 0 


by the mean p-valency of f and Lemma 1. This proves the first half of the 
theorem. 
As for the second half, let ~:(R), f:(R) correspond respectively to mh 
= fale). Then 
= 
An argument similar to that given above to prove the negative part of Lem- 


ma 1 now shows that there exist mean p-valent functions f, and arbitrarily 
large R, for which 


(7) Since x* increases practically from 0 to 1 in an arbitrarily small neighborhood of x=1 
when s is large. 




Ri 
0 0 0 


if k>1. If, however, ».(R) Sp, then p:(R) Sp, and in this case (in particular 
if f, is p-valent) f; is mean p-valent. 

5. After these preliminaries we now study the rate of growth of mean
p-valent functions. The method depends on the distortion theory of
Ahlfors(8), a theory which has already been applied by Cartwright(9) to obtain
an upper bound of $M(r, f)$ (the maximum modulus of $f(z)$ on $|z| = r$) for
p-valent functions. By $K(\alpha, \beta, \cdots)$ we denote a positive number depending
on the parameters shown explicitly. If it is clear on what parameters $K$
depends, as often happens, we simply write $K$. $K$'s will not necessarily be the
same in different contexts.

It is convenient to suppose first that $f(z)$ is regular for $|z| < 1$. We write
$w_0 = f(0)$. Let $C(R)$ be the circumference $|w| = R$ in the w-plane, and let
$E(R) = W \times C(R)$, the set of points common to $W$ and $C(R)$ (so that
$mE(R) = 2\pi R\,p(R)$). Two points of $E(R)$ are considered distinct if they
correspond to distinct sheets of $W$, even though they have the same projection
on the complex w-plane. $E(R)$ consists of a finite set of arcs $\{I_\nu(R)\}$(10),
$(\nu = 1, 2, \cdots, N)$, where $N$ depends on $R$ (and $f$). For fixed $R_1$ let $r_\nu(R_1)$
be the value of $r$ for which $B(r)$ (the transform of $|z| = r$ by $f$) just touches
$I_\nu(R_1)$ (for the first time). If $|w_0| < R < R_1$, at least one arc of $E(R)$ separates
$I_\nu(R_1)$ from $w_0$; if more than one, let $I_\nu(R)$ be the first which is met in
describing a continuous curve lying in $W$ and connecting $w_0$ with a point of
$I_\nu(R_1)$. Let $mI_\nu(R) = \Theta_\nu(R)$.


THEOREM 3. Suppose that 0<r<1, and that Ri>M(ro, f). Then 


(5.1) = (v = 1,2,---,N(Ri)) 
iin 


M (19,4) 


Take R,= M(r, f), and let I, be any one of the intervals {I,(R:)} which 
is touched by B(r) (there is at least one). Then, if r>1r0, we have by Theorem 
2 (with Ri= M(r, f), v=v(r)) 


om M(rJ) dR 1 
5.2 f s log ————_- + K((ro). 


This formula has been proved in effect by Cartwright [3]. I omit the proof of 


the more general formula (5.1) since no essentially new ideas are involved. 
6. Let 


(8) Ahlfors [1].

(9) Cartwright [3].

(10) $C(R)$ may not cut $B$ (the transform of $|z| = 1$ and the boundary of $W$), in which case
each interval of $E(R)$ is the whole of $C(R)$, and the number of intervals is the number of sheets
cut by $C(R)$ (zero for large $R$).




= + +--+, 
We deduce the following theorem from Theorem 3: 


THEOREM 4. If f(z) is mean p-valent and M(r, fx) is the maximum modulus 
of fx.on the circle |2| =r, then 


(6.1) M(r, iD) K(p, Rk) (1 


where 


Kip/k}) = Max (| a;|, | | ). 


Theorem 4 was stated without proof in V;. It is known for p-valent func- 
tions, the case k = 1 having been proved by Cartwright (loc. cit.) ; and the gen- 
eral case is an easy deduction from the case k =1 when f; is p-valent(!!). By 
combining Theorem 4 above with Theorem 3 of V; we obtain the following 
theorem (also stated without proof in Vi): 


THEOREM 5. If fi (z) is mean p-valent, then 
| a,| S K(p, 
provided p> tk. 


Theorem 5 for p-valent functions was proved in V;(!*). The restriction 
that p>}k is necessary. In fact, if nm21, take 


if | (mk +1) —2,| S k/2 and d, = mk +1, 
= v(r,) 1/2 

0, otherwise, 
where (A,) is a rapidly increasing sequence, and take a,=1. We suppose that 
the X, satisfy the inequality 


1 1/2 S 1 — 


Then f(z) is zero only at the origin, and each point of the circle | w| <(/6)"/? 
is covered by W, (the transform of | z| <1 by fx) once and only once. Since 
the area of W, is less than or equal to 7).,_,1/v?=2°/6, we see that W(R) 
<7’*R? for R>0, so that f, is mean m-valent. On the other hand, given any 
function y(n) tending steadily to 0 as n—~«, we can choose the X, such that 
|an| >W(n)/n'/? for an infinity of n. This Gegenbeispiel in modified form was 
suggested to me by Professor J. E. Littlewood. 


(4) For then the function f(z) = {f,(2"/*) }* is p-valent. This line of argument is not pos- 
sible here (see Theorem 2). ° 


(#2) The theorem for p-valent functions was known subject to certain restrictions on fx; in 
V; these restrictions were removed. 




7. We now prove Theorem 4. We define @,(p, R), the function of Theorem 
3, in terms of f:(p, 2), where p<1. Then we define 


0,(R) = lim 9,(p, R) S lim 2xRp(p, R) = 2xRp(R), 
pl 


since f; is mean p-valent. @,(R) is thus an integrable function of R (over any 
finite interval). Now let W; be the transform of |z| <1 by f;. Since rotation 
of W, about the origin through an angle 27/k transforms W, into itself, we 
see that if Ty) is any “tube” of W;, extending to ©, there are (k —1) other tubes 
T,, (v=1, 2, (—1)), each identical to T». If, therefore, 9,(R) is the width 
of 7, measured on C(R), we have 


> @,(R) = kOo(R) < mE(R) = 2x9(R), 


@(R) Rp(R). 


M(r) dR M(r) 1 dR logM(r) @R 
Mire) mir, P(R) R log P(e*) 
(log M(r) — log M(ro))? 
freemen) dR 


logM (rp 


(7.1) 


since (writing =1/p(e*)) 


(6—a)?= (f ve f var 


by Schwarz’s inequality. But 


f -f < 


f i + f f p(R)AR) = aR 
Spt — Ri) 


by the hypothesis of mean valency » and Lemma 1 (with s;=2, se=1). Sub- 
stituting from (7.2) (with Ri=log M(ro), Re=log M(r)) in (7.1) and using 
(5.2), we have 


1 
(7.3) log M(r) (1—9)? + K(p, k, To) + log M (1). 


and so 
Hence 
2a 




Theorem 4 will follow at once from (7.3) (with ro= 4, say) if 
(7.4) M (ro, f) < K(p, k, 


To prove (7.4) it is sufficient to show that the family of mean p-valent func- 
tions f, is quasi-normal of order [p/k] at most, and this follows from the defi- 
nition of mean valency » and the form of f;(*). This completes the proof of 
Theorem 4. 

8. The full strength of the hypothesis of mean valency $p$ is not used in
Theorem 4; all that is used is (7.2), and this in the form

(1)    $\int_{R_1}^{R_2} p(e^R)\,dR \le p/s + p(R_2 - R_1),$

where $s > 0$, is implied by (s). The hypothesis (1) is, in fact, sufficient for the
truth of all theorems proved in this paper. Furthermore, only the properties
of $W$ in the neighborhood of $\infty$ are relevant. For example, if $W(R) \le p\pi R^2$
only for $R > R_0 > 0$, then

    $M(r, f) = O(1 - r)^{-2p},$

where the constant implied in the $O$ depends on $R_0$, $f$, and $p$. More generally if

(8.1)    $\limsup_{R\to\infty} \dfrac{W(R)}{\pi R^2} \le p,$

then, for every $\epsilon > 0$,

    $M(r, f) = O(1 - r)^{-2p-\epsilon}.$

Moreover, if (8.1) is satisfied with $p = 0$, then

    $M(r, f) = O(1 - r)^{-\epsilon}.$


We thus obtain, in particular, the striking result that a schlicht function 
which fills only an infinitesimal part of the w-plane is of infinitesimal order. 

9. We shall say that a set of points in a domain D is a path P if it is a 
Jordan curve. If the equation of P is 


P(t) = x(t) + iy(Z), 
where ¢ varies from 0 to 1, and if, given e, 
| P(t) —a| <e 


for to(€) <t <1, then we say a is an end of P, or that P converges to the point a. 
A path in | z| <1 with end e*® will be denoted by P(@). 


(48) See Montel [5, p. 73]. The test given there for quasi-normality [p/k] is satisfied if 
applied to the functions f,(z"*), which are regular in the unit circle slit along a radius, and this 
implies that the family f;(s) is quasi-normal of order [p/k]. 


THEOREM 6. Suppose that $f(z)$ is p.m.v. and that $E_\theta$ is a set of distinct points.
If to each point $\theta$ of $E_\theta$ there corresponds at least one path $P(\theta)$ for which

(9.1)    $\liminf_{P(\theta)} (1 - r)^{\alpha(\theta)} |f(z)| > 0,$

then

(9.2)    $\sum_{E_\theta} \alpha(\theta) \le 2p.$


It is sufficient to prove the theorem for an enumerable set (0,)('). It is 
then enough to show that 


(9.3) ay 2p, 


wherea, = a(6,) >0. Under these circumstances there correspond toR > Ro(n, f), 
n arcs I,(R), (1S such that the transform of J,(R) by z=f-'(w) is a 
cross section(") y,(R) of the unit circle separating the point e*® from the ori- 
gin and converging to e* as R-+«(). Let R,(r) be the largest R for which 
7(R) has points in common with the circle |z| =r, and write 


mI,(R) = @,(R) = 27RE,(R). 
Then 


dR R,(r) dR (log R,(r) R;)? 
K 


K @,(R) 
as in the proof of Theorem 4. That is to say, 


log R,(r) l R, — R 2 l R, a. K 2 
f (log R,(r) 1) (log R,(r) 1) 
K 


=> 
=,(e*) Z, dR 


f®%dR/O(R) log 1/(1 — 1)? + K 
by Theorem 3, 
2 $a, log R,(r) + o(log R,(r)), 
by the hypothesis (9.1). This inequality may be written in the form 


R 
< f =,(eR)dR + o(R). 


K 


Summing over v from 1 to 7, we obtain 


(#4) But even a schlicht function may tend to © at a non-enumerable set of discrete 
points 

(5) By a cross section of a domain D we mean a path lying in D (except for its end-points) 
and connecting two distinct boundary points of D. 

(*) That is, given ¢, y»(R) lies in a circle of radius ¢ and center e*” if R>Ro(e). The state- 
ment is intuitive, and in any case is covered by familiar arguments. 




n Ron R 
+ 0(R) < f p(e®)dR + o(R), 
K 


K 
since >.,.;2»(R) S$ p(R), 
S pR + o(R), 
by the hypothesis of mean valency p (see (7.2)). Dividing by R and letting 
R- ©, we obtain (9.3), and (since m is arbitrary) this proves the theorem. 
10. THEOREM 7. Suppose $f(z)$ is p.m.v. and that

(10.1)    $f(z) = O(1)$

on some path $P(\theta_0)$. Then on any path $P(\theta_0)$

(10.2)    $\limsup (1 - r)^{2p} |f(z)| = 0.$

We suppose there is an infinite sequence of points, (z,) say, tending to e*% 
and a number K >0 such that | f(zn)| >|f(¢n-1)|, and 


(10.3) | | > K(1 — = 


we argue by reductio ad absurdum. 

Suppose first that there exists an arbitrarily large R such that the trans- 
form of E(R) by z=f-'(w) contains an infinity of nonoverlapping cross sec- 
tions y,(R) of |z|<1 converging to as (1”), and that each y, sepa- 
rates at least one point z,, from the origin. Changing the numeration (if 
necessary) we may suppose that y,(R) separates z, from z=0. Let I,(R) be the 
transform by f(z) of y,(R), and write mI,(R) =9,(R) =2RE,(R), R,=|f(z,)|. 
Then 

(log R, — K)? 


(10.4) plog R, 2x + O(1). 


Otherwise 


= + O(1) <1 


by Theorem 3, and this contradicts the hypothesis (10.3). But (as in the proof 
of Theorem 6) 


R, 
log R, < 
K 


(log R, — K)? 
2x f®dR/O,(R) 


log R, 
sf” zvenar, 
K 


and so, substituting in (10.4), 


(7) But no 7, separates e“0 from the origin. 




log Ry 
(10.5) p log R, < f =,(e®)dR + O(1). 


Finally, since y,-1 and 7, are nonoverlapping, we see that Z,_:(R) and 2,(R) 
are distinct for all (sufficiently) large R, and so 


log R, log R, log R, 
f s f — f 
K K K 


< p log R, — p log R,-1 + O(1) 


by the hypothesis of mean valency p and (10.5) for y—1. (10.5) and (10.6) 
give a contradiction if y>vo, and so the infinity of nonoverlapping cross sec- 
tions with the properties stated cannot exist. The alternative is that forR > Ro 
one cross section, y(R) say, separates all but a finite number of (z,) from z=0. 

Now we can find a number R1 such that, for R > R1, γ(R) does not separate
e^{iθ0} from z = 0. Otherwise there would exist no path P(θ0) on which
f = O(1), contrary to the hypothesis of the theorem. Since, on the other hand,
γ(R) separates all but a finite number of the (z_ν) from z = 0, we see that, for
R > R1, γ(R) has e^{iθ0} as one end-point. This, I say, is impossible(18). In fact,
suppose R1 < R2 < R3, and connect γ(R2) with γ(R3) by a simple analytic curve
lying in |z| < 1. Let q1 be the last intersection of this curve with γ(R2), q2 the
first intersection with γ(R3). Then the portion P of the curve connecting q1
with q2 lies in a sub-domain D of |z| < 1 (bounded by γ(R2), γ(R3), and points
of |z| = 1), and divides D into two domains. Let D1 be the domain bounded
by γ(R2), γ(R3), P, and e^{iθ0}; and let W1 be the transform of D1, Π the transform
of P, by f. Suppose R2 < R < R3, and let Γ(R) be the first cross section
of W1 on C(R) which is met in describing a continuous curve from E(R2) to
E(R3) in W1. We write θ(R) = mΓ(R). Then, by the hypothesis of mean valency
p,


(10.6) 


Ry 2 
f mI(R)dR S 
R 


1 


Hence, if K=2prR‘/(R:—R), and E£ is the set of values of R in the interval! 
Ri<R<R; for which 


mI(R) > K, 
then 


(10.7) mE < 4(Re— Ri). 
Next, we define 
I(R), if mI(R) = K, 


J(R) = 
(R) portion of of length K measured from II if mI(R) > K. 


(18) For finitely mean valent functions, but not for infinitely mean valent functions.


430 D. C. SPENCER [November 


Let W2 be one of the sub-domains of W1 swept out by J(R) as R varies from
R2 to R3, which contains, as part of its boundary, a set A of boundary points
of W of positive measure. Such a sub-domain exists by (10.7). Further, W2 is
plainly a finitely valent domain; and every point of its boundary is accessible
(by the definition of accessibility). We map W2 on a sub-domain D2 of |z| < 1
by f^{-1}, the set A corresponding to the boundary point e^{iθ0}. This contradicts
well known theorems on the correspondence of boundaries(19) and proves our
statement.

11. The conclusion (9.2) of Theorem 6 is a best possible one when p is 
integral, as shown by the p-valent function 


(i + 


On the other hand, the hypothesis (9.1) cannot be relaxed to the extent of 
replacing lim inf by lim sup. We have in fact 




THEOREM 8. If ψ(r) is any real function of r satisfying


(11.1) (1 − r)² = o(ψ(r)),


then there is a function f(z) regular and schlicht in |z| < 1 such that, for at least
one path P(θ_ν),


(11.2) lim sup ψ(r) |f(z)| > 0 on P(θ_ν)


at an enumerable infinity of discrete points (θ_ν).
The following theorem shows that Theorem 7 is best possible. 


THEOREM 9. Suppose ψ(r) satisfies (11.1). Then there is a schlicht function
f(z) such that the radial limit, lim_{r→1} f(re^{iθ}), exists everywhere and is finite, but


(11.3) lim sup ψ(r) |f(z)| > 0
on at least one path P(θ0).


The function whose existence is asserted in Theorem 9 is simpler and we 
discuss it first. We take f(z) to be the function which maps | z| <1 on the 
simply-connected domain W shown in Fig. 2, with f(0) =0 and f’(0) real and 
positive (so that f is uniquely defined by W). W consists of the whole w-plane
slit along an infinity of concentric circles of radii R_ν (ν = 1, 2, ...), each
annular region (R_ν, R_{ν+1}) being connected by a “thin tube” to the interior
of the circle of radius R1. Every point of the boundary B of W is accessible
except points on the line extending from w to ∞. The line from w to ∞ is an
infinite prime-end with the single accessible nuclear point (Hauptpunkt)


(19) See, for example, Carathéodory [2].


w(20). Let e^{iθ0} be the point corresponding to this prime-end by f^{-1}. The function
f(z) tends to a finite limit on every radius, the limit being, however,
unbounded in the neighborhood of e^{iθ0}. We choose (successively) the radii


Fig. 2


(R_ν) of the figure and construct a path P(θ0) whose transform by f(z) approximates
to every point of the infinite prime-end and on which (11.3) is satisfied.
We show that the radii (R_ν) can be so chosen that


(11.4) (R_ν + R_{ν+1})/2 > K/ψ(r_ν),


where r_ν is the value of r for which B(r) first touches the circumference
C(½(R_ν + R_{ν+1})). Then if z_ν satisfies


|z_ν| = r_ν, |f(z_ν)| = (R_ν + R_{ν+1})/2,


we have only to connect the z_ν to obtain the desired path P(θ0). For if R > R_ν
the set E(R) transforms by z = f^{-1}(w) into a sequence of nonoverlapping cross
sections (γ_ν(R)), where γ_ν(R) separates z_ν from the origin if ν > ν0(R). Since


(20) See Carathéodory [2].


γ_ν(R) converges to e^{iθ0} as ν → ∞, z_ν tends to the same point (as ν → ∞) and


ψ(r_ν) |f(z_ν)| = ψ(r_ν)(R_ν + R_{ν+1})/2 > K


by (11.4).

If, having chosen R_ν, we can choose R_{ν+1} such that (11.4) is satisfied by the
function f_ν(z) which maps (with f_ν(0) = 0, f_ν′(0) > 0) the circle |z| < 1 on the
sub-domain W_ν of W shown in Fig. 3, it will follow by the subordination principle(21)
that (11.4) is a fortiori satisfied by f uniformly in ν.


Fig. 3


To show that, for suitable choice of R_{ν+1}, f_ν satisfies (11.4), we cut W_ν along
a radius from 0 to a point w of its boundary on |w| = R_ν, and map the resulting
domain by means of s_ν(z) = σ + iτ = log w on a strip S_ν. Now for a parallel
strip U defined by


$(2) = E+ in, <E < az, 


we have 


1 
(11.5) M(r,$)2 + K(&) 


if &+A<M(r,f)<&—A. But for suitable zo (depending on R), the function 
s,(h(z)), where 

20 

h(z) 

229 — 1 
is superordinate to a U with &)=log R,, &£:=log R,41 and a=1—1/é, (since 
we may make the angular spread of the annular region of W, as near to 2r 
as we please). Hence 


M(r, 2 M(r, s»(h)) K(R,) = M(r, 9) ‘val K(R,), 
by subordination. If Ki(R,)< M(r, [)<log R,41—K4i, this is not less than 
(21) See Littlewood [6].


Rosi 




1 1 1 
— log — K(R,), 


by (11.5); and is greater than or equal to 


Fig. 4


& 


if r (and so R,4:) is large enough. In particular, 


log patna = M(r,, s,) 2 log 
2 ¥(r,) 
if R_{ν+1} > R0(R_ν). We can thus choose R_{ν+1} so that f_ν satisfies (11.4), and this
completes the proof of Theorem 7.
12. In Theorem 8 let f(z) be the function which maps |z| < 1 on the
w-plane slit as shown in Fig. 4. The domain consists of an infinity of “tubes”
(numbered as shown) connecting the circle |w| < R1 with ∞. If we write


N(n) = 




then the νth tube has an angular spread of nearly 2π over the “long” intervals
(of R):


(12.1) Ry < R < (n 2 »). 


Using the same notation as in Theorem 9, we see it is enough to show that 
the radii may be chosen successively in such a way that 


R, + K 
2 


The proof is, however, now similar to that given already in the preceding sec- 
tion for the corresponding inequality (11.4), and I omit it. 
13. I add finally a theorem of a somewhat different sort: 


THEOREM 10. Suppose that f(z) is regular in |z| < 1, and satisfies the condition
that W(R) < ∞, 0 ≤ R < ∞. Let E1(θ), E2(θ) be the sets of limit points of
f(z) as z tends to e^{iθ} along two paths P1(θ), P2(θ) respectively. Then E1(θ) × E2(θ)
≠ 0(22).


This theorem is related to a well known theorem of Lindelöf(23) which
states that, if f is bounded in |z| < 1 and tends to limits l1, l2 along two paths
P1(θ), P2(θ), then l1 = l2. Theorem 10 is false for bounded functions; there exist
(infinitely mean valent) bounded functions such that, for at least one point
e^{iθ}, E1(θ) × E2(θ) = 0(24). On the other hand, if F(θ) = E1(θ) × E2(θ), the hypothesis
“F(θ) ≠ 0 for all θ” does not imply the finiteness of W(R)(25), so that
the conditions F(θ) ≠ 0, W(R) < ∞ are not equivalent.

In proving Theorem 10 we may plainly suppose that |f| is bounded on
P1(θ), P2(θ) and that P1, P2 do not intersect. Then, joining P1 to P2 by a path
Q lying inside |z| < 1, we can map the sub-domain of |z| < 1 bounded by
P1, P2, and Q, onto the unit circle, the paths P1, P2 being transformed into
two arcs, abutting at a point Let L,, Lz be the transforms of 
by f, and let A1, A2 be the projections of L1, L2 on the w-plane. We suppose
E1 × E2 = 0, and argue by reductio ad absurdum.

If E1 × E2 = 0, there exist two positive numbers δ and r1 such that the portions
of A1 and A2 corresponding to the arc of |z| = 1 which lies inside a circle
of radius r1 and center e^{iθ} are separated by a distance δ. Let c(r) be that arc
of the circle of radius r and center e^{iθ} which lies in |z| < 1, and let Γ(r) be


(22) That is, E1 and E2 contain a common point (which may be ∞).

(23) Lindelöf [4].

(24) An example is the function f which maps the unit circle on the circle |w| < R covered
infinitely many times, with winding point at w = 0. There is then one point e^{iθ}, and two paths
P1, P2 converging to it, such that the transforms of P1 and P2 by f are concentric circles.

(25) In fact, if f maps the unit circle on a Riemann domain bounded by a “spiral” with
asymptotic point w = 0, then f tends to a limit on every path P(θ). By coiling the spiral sufficiently
loosely, the sum of the areas bounded by successive loops can be made infinite.




that portion of the transform of c(r) which connects A; to A. Let W,, be the 
simply connected domain bounded by ['(n), --- , and subsets B,(r;), 
B2(r1) of the boundary continua A1 and A2. Now I say no point w is covered
by W_{r1} more than a finite number of times. For, by the construction of W_{r1},
the boundary curves B1(r1), B2(r1) are both simple, and Γ(r1) (the transform
of a portion of c(r1)) is analytic. Hence, if a point w were covered an infinity
of times, some neighborhood of w would be covered an infinity of times, and
the area of W_{r1} would therefore be infinite, contradicting the hypothesis that
W(R) < ∞, for finite R (since W_{r1} is a finite domain). Similarly, if W′_{r1} is an
interior domain, the boundary of which is at distance δ/4, say, from the
boundary of W_{r1}, then the valency of points of W′_{r1} is uniformly bounded by
a number K.

Finally, as r → 0, Γ(r) converges to an “end” E of W(r1) in the sense of
Carathéodory(26). Since W(r1) is bounded and A(r1), A(r2) are separated by a
distance δ/2, Γ(r) cannot converge to a point or to ∞. Therefore, E is not a
prime-end, and so cannot correspond to a single point. This is a contradiction
and proves the theorem.


REFERENCES 


1. L. Ahlfors, Untersuchungen zur Theorie der konformen Abbildung und der ganzen Funktionen,
Acta Academiae Scientiarum Fennicae, vol. 1 (1930), no. 7.

2. C. Carathéodory, Über die Begrenzung einfach zusammenhängender Gebiete, Mathematische
Annalen, vol. 73 (1912), pp. 323-370.

3. M. L. Cartwright, Some inequalities in the theory of functions, Mathematische Annalen,
vol. 111 (1935), pp. 98-118.

4. E. Lindelöf, Sur un principe général de l’analyse et ses applications à la théorie de la représentation
conforme, Acta Academiae Scientiarum Fennicae, vol. 46 (1915).

5. P. Montel, Leçons sur les Familles Normales, Paris, 1927.

6. J. E. Littlewood, On inequalities in the theory of functions, Proceedings of the London
Mathematical Society, (2), vol. 23 (1925), pp. 481-519.


(26) See Carathéodory [2]. Carathéodory develops his theory only for schlicht functions;
here we require its extension to finitely valent functions. The extension is, however, trivial.


MASSACHUSETTS INSTITUTE OF TECHNOLOGY, 
CAMBRIDGE, Mass. 


INTEGRAL SETS OF QUATERNION ALGEBRAS 
OVER A FUNCTION FIELD 


BY 
LEONARD TORNHEIM 


1. Introduction. The theory of rational quaternion algebras suggests a 
corresponding theory for quaternion algebras over a rational function field 
F(z). We can anticipate points of close analogy because of many similarities 
between the set of all rational integers and the set of polynomials F[z]. We 
may also expect results peculiar to each of these theories traceable to certain 
fundamental differences between these two integral domains. 

We find a basis for every integral set S of Q after suitably normalizing a 
basis of Q. When F is the field of all real numbers, canonical bases for both Q 
and S are obtained. We discuss properties of Q which make S be a principal 
ideal ring or not. Conditions are provided for a quantity in F[z] to generate a 
prime ideal in S. Throughout applications are made in the cases for which F 
is either a real number field or a finite field. 

2. Integral sets of Q with characteristic not two. When F has characteristic
not two, a quaternion algebra(1) Q has a basis 1, i, j, ij with i² = τ, j² = σ,
ij = −ji. The basis can be chosen as normalized(2); that is, σ, τ lie in F[z], are
relatively prime, and contain no square factors. If F is a Hilbert irreducibility
field(3), it is possible in addition to take τ a prime (i.e., irreducible) in F[z].
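The multiplication rules i² = τ, j² = σ, ij = −ji determine the whole algebra. The following is a minimal sketch, not part of the original paper: it assumes the indeterminate z has been specialized to a rational value, so that τ and σ become rational numbers, and the names `qmul` and `norm` are hypothetical. It uses the norm form N(x) = x0² − τx1² − σx2² + στx3² and checks that the norm is multiplicative.

```python
from fractions import Fraction

def qmul(x, y, tau, sigma):
    """Product of two quaternions given by coordinates in the basis 1, i, j, ij,
    where i^2 = tau, j^2 = sigma and ij = -ji (illustrative helper)."""
    a0, a1, a2, a3 = x
    b0, b1, b2, b3 = y
    return (
        a0*b0 + tau*a1*b1 + sigma*a2*b2 - sigma*tau*a3*b3,  # coefficient of 1
        a0*b1 + a1*b0 - sigma*a2*b3 + sigma*a3*b2,          # coefficient of i
        a0*b2 + a2*b0 + tau*a1*b3 - tau*a3*b1,              # coefficient of j
        a0*b3 + a3*b0 + a1*b2 - a2*b1,                      # coefficient of ij
    )

def norm(x, tau, sigma):
    """Norm form x0^2 - tau*x1^2 - sigma*x2^2 + sigma*tau*x3^2."""
    x0, x1, x2, x3 = x
    return x0*x0 - tau*x1*x1 - sigma*x2*x2 + sigma*tau*x3*x3

# Assumed specialization: tau = z, sigma = z + 1 evaluated at z = 3.
tau, sigma = Fraction(3), Fraction(4)
x = (Fraction(1, 2), Fraction(2), Fraction(-3), Fraction(5, 7))
y = (Fraction(4), Fraction(-1, 3), Fraction(2, 5), Fraction(1))
product = qmul(x, y, tau, sigma)
```

Under these assumptions `norm(product, tau, sigma)` equals `norm(x, tau, sigma) * norm(y, tau, sigma)`, which is the property used repeatedly when the text requires N(ξ) to lie in F[z].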

Integral sets(4) S of Q are defined by the usual four properties R, C, U,
and M. We obtain a basis for an integral set S of Q in


THEOREM(5) 1. Let Q have a normalized basis. Let τ′ be the product of all
prime factors of τ for which σ is a quadratic residue, and τ″ = τ/τ′ be monic(6).
Let σ′ and σ″ be defined similarly. Then every integral set S of Q has a basis
over F[z] of the form 1, i, j, w, where


(1) w = ai/τ′ + bj/σ′ + ij/σ′τ′


Presented to the Society, April 8, 1938; received by the editors November 16, 1939.

(1) A. A. Albert, Structure of Algebras, American Mathematical Society Colloquium Publications,
vol. 24, 1939, p. 145.

(2) A. A. Albert, Integral domains of rational generalized quaternion algebras, Bulletin of
the American Mathematical Society, vol. 40 (1934), p. 166.

(3) For a summary of results on Hilbert irreducibility fields see A. A. Albert, Involutorial
simple algebras and real Riemann matrices, Annals of Mathematics, (2), vol. 36 (1935), p. 890.

(4) L. E. Dickson, Algebren und ihre Zahlentheorie, 1927, p. 155.

(5) For analogous results for rational algebras see, in addition to the references already
cited, C. G. Latimer, Arithmetics of generalized quaternion algebras, American Journal of Mathematics,
vol. 48 (1926), pp. 57-66; M. D. Darkow, Determination of a basis for the integral elements
of certain generalized quaternion algebras, Annals of Mathematics, (2), vol. 28 (1926),
pp. 263-270.

(6) A polynomial is monic if its leading coefficient is unity.


436 




QUATERNION ALGEBRAS 


with a, b any quantities in F[z] satisfying

(2) τb² ≡ τ″² (σ′), σa² ≡ σ″² (τ′).

For, by conditions U, C, and R, if ξ is in an integral set S, then the traces
of ξ, iξ, jξ, and ijξ are in F[z]. Thus ξ = x0 + x1i/τ + x2j/σ + x3ij/στ with the
x’s all in F[z]. Now N(ξ) is in F[z] if and only if


(3) x1²σ ≡ x3² (τ), x2²τ ≡ x3² (σ).


Since σ is not a quadratic residue of any factor of τ″ and τ is not a quadratic
residue of any factor of σ″, we see from the congruences (3) that x1 is divisible
by τ″, x2 by σ″, and x3 by σ″τ″. It follows that every quaternion in an integral
set S lies in a domain (1, i, j, w) over F[z], where w is defined in (1).

Let r3/σ′τ′ be the g.c.d. of all the coefficients of ij for quantities in S. Then
r3/σ′τ′ is a linear combination (with multipliers in F[z]) of the coefficients
of ij of a finite set of quaternions of S. Let ρ be the corresponding linear combination
of the same quaternions. Hence ρ is in S. Also ρ = r0 + r1i + r2j + r3w′,
where the r’s are in F[z], w′ = a′i/τ′ + b′j/σ′ + ij/σ′τ′, and a′, b′ satisfy (2).
If η is also in the integral set S, η = y0 + y1i + y2j + y3w and y3 is divisible by r3.
Using the fact that N(ρ + η) must be in F[z], we find that η is in S′ = (1, i, j, w′).
It is easily verified that S′ satisfies conditions R, C, U. Inasmuch as it contains
the maximal set S, we have S = S′. This completes the proof.

If σ′ has m factors and τ′ has n factors, then there are 2^{m+n} pairs of incongruent
solutions a, b of (2). The corresponding 2^{m+n} integral sets may be
proved distinct by calculating N(w + w′) for w ≠ w′.

The monic quantity d = σ″τ″, although defined by a particular basis, is an
invariant of the algebra called the fundamental number(7) of Q. It is in fact,
except for a factor in F, the square root of the discriminant of an integral set
of Q. Every integral set is a maximal order of Q and all maximal orders of Q
have the same discriminant(8). This implies the invariance of the fundamental
number d. We proceed to give a direct proof based upon our definition of d.


THEOREM 2. The fundamental number d = σ″τ″ of a quaternion algebra Q
is an invariant of the algebra.
For, let Q have a normalized basis, 1, i, j, ij, with 7?=7, 7?=c. If 1, io, jo, 
iojo is another normalized basis, 72 74 =00, then 
to = + + %2, %3) = 1, 


jo = (nt + + Yr Ys) = 1, 
where the x’s and y’s are in F[z]. 


(") H. Brandt, Idealtheorie in Quaternionenalgebren, Mathematische Annalen, vol. 99 
(1928), pp. 1-29; C. G. Latimer, On the fundamental number of a rational generalized quaternion 
algebra, Duke Mathematical Journal, vol. 1 (1935), pp. 433-435. 

(*) M. Deuring, Algebren, 1935, p. 88. 


437 


438 LEONARD TORNHEIM [November 


Let d1 be a prime divisor of d = σ″τ″, the fundamental number corresponding
to the basis 1, i, j, ij. We first assume that d1 divides τ″. Then d1
divides xsye because 49jo+joto=0. Now d; cannot divide both xz and x4; if so, 
we would have on computing 7o 

wir = 0 (di), 
and thus 


an impossibility for d; a divisor of r’’. Similarly d; does not divide both y2 
and 4. 

Suppose that d, divides x2. Then d; does not divide x, and consequently d; 
divides ro. If d; did not divide 7)’ , we would have 


o=c? 


and thus 


2 2 2 
yit + y20 — (ds), 


(4) yo (ds). 


Since (oo, To.) =1, we have (a0, di) =1 and also (c, d;) =1. Noticing also that 
(a, d,;) =1, we see that congruence (4) implies that o is a quadratic residue 
of d,, a contradiction to the assumption that d,; divides r’’. We have proved 
that d; divides 7/’ and hence that it also divides dy) 

If d1 divides y2, similar reasoning would show that d1 divides σ0″.

A parallel proof is used in case we had assumed d1 to be a divisor of σ″
to demonstrate that d1 divides either σ0″ or τ0″.

Hence every prime divisor of d divides d0 and, of course, conversely.
Since d and d0 are square-free and monic, d = d0.

We shall use this lemma of Albert(9).


LEMMA 1. If in the generalized quaternion algebra Q we replace σ by
(g² − τh²)σ with g, h in F(z), we obtain an equivalent algebra.
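One standard way to see why Lemma 1 holds (offered here as an illustration, not necessarily Albert's own argument) is to replace j by j′ = (g + hi)j. Then j′ still anticommutes with i, and j′² = (g² − τh²)σ. The sketch below checks both facts numerically; it assumes z has been specialized to a rational value so that τ, σ, g, h are rational numbers, and the helper names are hypothetical.

```python
from fractions import Fraction

def qmul(x, y, tau, sigma):
    """Product in the basis 1, i, j, ij with i^2 = tau, j^2 = sigma, ij = -ji."""
    a0, a1, a2, a3 = x
    b0, b1, b2, b3 = y
    return (
        a0*b0 + tau*a1*b1 + sigma*a2*b2 - sigma*tau*a3*b3,
        a0*b1 + a1*b0 - sigma*a2*b3 + sigma*a3*b2,
        a0*b2 + a2*b0 + tau*a1*b3 - tau*a3*b1,
        a0*b3 + a3*b0 + a1*b2 - a2*b1,
    )

def new_generator(g, h, tau, sigma):
    """Return j' = (g + h*i) * j as a coordinate 4-tuple."""
    return qmul((g, h, 0, 0), (0, 0, 1, 0), tau, sigma)

# Assumed sample values standing in for tau, sigma, g, h after specializing z.
tau, sigma = Fraction(3), Fraction(4)
g, h = Fraction(2), Fraction(5)
i_unit = (0, 1, 0, 0)
j_new = new_generator(g, h, tau, sigma)
j_new_squared = qmul(j_new, j_new, tau, sigma)   # should be (g^2 - tau*h^2)*sigma
```

With these sample values `j_new_squared` is the scalar (g² − τh²)σ, so the pair i, j′ satisfies the defining relations with σ replaced by (g² − τh²)σ.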


In the remainder of this section F is specialized to be the field of all real 
numbers. We apply Lemma 1 to prove 


THEOREM 3. Let F be the field of all real numbers. Then Q over F(z) has a
basis 1, i, j, ij, with i² = −1 and j² = σ, where σ has leading coefficient ±1, is a
product of distinct linear factors, and is, except for sign, the fundamental number
of Q. There is a single integral set S and it has a basis 1, i, j, ij. Furthermore
there is a one-to-one correspondence between the classes of equivalent quaternion
algebras (including non-division algebras) over F(z) and the square-free polynomials
σ in F[z] of leading coefficient ±1 containing only linear factors.


(9) See footnote 2.


By a theorem of Tsen(10), there are no normal division algebras of order
greater than 1 over the field of complex numbers with one indeterminate adjoined.
Hence F((−1)^{1/2}) splits Q and we may take i² = −1 since Q contains(11)
a field equivalent to F((−1)^{1/2}). Now j² = σ and we may take σ
square-free and in F[z]. The leading coefficient may be taken as ±1, since
if σ has leading coefficient a then σ/(|a|^{1/2})² has the desired property.

If r = z² + 2bz + c, with b and c in F, and is positive definite, the discriminant
of r is 4(b² − c) and is negative. Then r is a sum of two squares in F[z];


r = (z + b)² + [(c − b²)^{1/2}]².


If r divides σ, an application of Lemma 1 in reverse when τ = −1 serves to
remove the factor r from σ. In this way all positive definite prime factors of σ
are removed, and we can assume now that σ contains no such factors. The
only other irreducible polynomials in F[z] are linear. Hence σ is a product of
linear factors and they are distinct because σ is square-free.

The fact that 1, i, j, ij form a basis of the integral set S follows immediately
from Theorem 1, since τ = −1 and −1 is never a quadratic residue of a
linear function of F[z]. Hence the fundamental number of Q is ±σ.

Let σ have leading coefficient ±1 and contain only distinct linear factors.
If the norm


(5) x0² + x1² − σ(x2² + x3²)


of a quaternion in S is zero, it must be zero for every value taken in F by the
indeterminate z. Setting z in turn equal to each of the roots of σ and using the
fact that x0² + x1² is positive definite, we deduce that both x0 and x1 are divisible
by σ. Dividing (5) by σ and using the same reasoning, we find that x2 and x3
are both divisible by σ. Continuing in this way, we find that x0, x1, x2, x3 are
all divisible by every power of σ. This is possible only when σ is in F, i.e.,
σ = ±1. But σ ≠ −1, for then (5) is positive definite. Consequently when Q
is not a division algebra, σ = 1.

If Q contains quantities having norms with negative leading coefficient,
then using (5) we conclude that σ is monic; otherwise −σ is monic. Hence the
sign of σ is determined by the algebra.

We know then that σ is uniquely determined by the algebra since except
for sign it is the fundamental number of the algebra.

(10) C. C. Tsen, Algebren über Funktionenkörpern, Göttingen Dissertation, 1934.

(11) M. Deuring, Algebren, 1935, p. 46.

(12) A. A. Albert, Structure of Algebras, 1939, p. 145.

3. Integral sets of Q with characteristic two. Let the field F have characteristic
two. Then Q has a basis(12) 1, i, j, ij, where i² + i + α = 0, j² = γ,
ij = j(i + 1), and α and γ are in F(z). Choose m2 ≠ 0, m0 in F(z) so that
γ0 = m0² + γm2² is in F[z] and has minimal degree in the set of all quantities
of that form. Now γ0 ≠ 0 because otherwise the nonzero quaternion
m0 + m2j would be a divisor of zero. Evidently γ0 is square-free. Whenever
γ0′ = m0′² + γ0m2′², then γ0′ = (m0′ + m0m2′)² + γ(m2m2′)²; hence γ0 has minimal
degree in the set of all quantities of F[z] of the form m0′² + γ0m2′², where
m2′ ≠ 0. The transformation

i1 = i + m0ij/γm2, j1 = m0 + m2j

replaces γ by γ0.

Let β0 be a nonzero quantity of lowest degree for which the equation
xj1 = j1(x + β0) has a solution with x an integral quaternion. Denote such a
solution x by r0 + r1i1 + r2j1 + r3i1j1. Necessarily r3 = 0. Let b′ be the leading
coefficient of β0. The transformation


i0 = (r0 + r1i1 + r2j1)/b′, j0 = j1

produces a new basis of Q of the type described in


THEOREM 4. An algebra Q of characteristic two has a basis 1, i, j, ij where
i² = βi + α, j² = γ, ij = j(i + β); α, β, γ are in F[z]; β is monic and has least degree
among all nonzero β0 in F[z] for which the equation xj = j(x + β0) has an integral
quaternion x as solution; and γ is a square-free polynomial and has the least
degree for all polynomials of the form m0² + m2²γ having m0, m2 in F(z) and m2 ≠ 0.


A basis of the type given in Theorem 4 will be called a normalized basis. 

When F is perfect we can take γ = z. This is implied by a result of
Albert(13). We give here a direct proof. First, γ cannot be in F for then
γ^{1/2} + j would be a divisor of zero. Hence γ has degree ≥ 1. Since γ is in F[z]
and F is perfect, γ = c1² + c2²z with c1 and c2 in F[z]. Thus z = (c1/c2)² + γ(1/c2)²
and has minimal degree. We have proved part of


THEOREM 5. When F is perfect, then in Theorem 4 we may take γ = z and β
monic, square-free, and prime to z.


A value of β, because of the minimal degree property, is necessarily square-free.
For, if β = β1p², then i′ = (m0 + i + m2j)/p is integral if m0 and m2 are
chosen in F[z] to satisfy m0² + m2²z = α. Furthermore i′j = j(i′ + β/p), and β/p
has degree less than that of β. These properties of i′ contradict the assumptions
made about β.

In addition, β is not divisible by z. Otherwise, if we take r0 in F to be the
square root of the constant term of α, and r2 to be the square root of the coefficient
of the linear term of βr0 + α, we have that i′ = (r0 + i + r2j)/z is integral,
i′j = j(i′ + β/z), and β/z has degree less than that of β. We have here a
contradiction to the defining property of β.


(13) A. A. Albert, p-algebras over a field generated by one indeterminate, Bulletin of the American
Mathematical Society, vol. 43 (1937), p. 735.


THEOREM 6. Let Q have a normalized basis. Then every integral set S in Q
has a basis 1, i, j, w = (x1x2 + x1i + x2j + ij)/m, where m is the largest factor of βγ
for which there are solutions x1, x2 of


x1² ≡ γ, x2² ≡ βx2 + α (m).
Furthermore m is square-free.


Let ξ be in S. By properties R, C, and U, the traces of the quaternions
ξ, iξ, jξ, ijξ are in F[z]. Hence ξ = (x0 + x1i + x2j + x3ij)/βγ with the x’s in
F[z].

Since the denominators of integral quantities divide βγ, an integral set S
must have a basis. This basis can be taken in the form


w1 = e0/βγ, w2 = (f0 + f1i)/βγ,
w3 = (g0 + g1i + g2j)/βγ, w4 = (h0 + h1i + h2j + h3ij)/βγ,


with the e0, f’s, g’s, and h’s in F[z]. We may assume, since 1, i, j, ij are all
in S, that e0, f1, g2, h3 either equal βγ or else have degree less than that of βγ
and the remaining f0, g’s, and h’s have degrees less than D(βγ) (the degree of a
polynomial a is designated by D(a)).

Obviously, w1 is not integral unless e0 = βγ; w1 = 1.

If D(f1) < D(βγ), then D(T(w2)) < D(β) while w2j = j(w2 + T(w2)). This contradicts
the choice of β; hence D(f1) = D(βγ) and in fact f1 = βγ. Since w2 − i
is in S, f0/βγ is in F[z]; hence f0 = 0, and w2 = i.

From D(g1) < D(βγ), it follows that D(T(w3)) < D(β) and w3j = j(w3 + T(w3)),
a contradiction to the choice of β unless g1 = 0. If D(g2) < D(βγ), then w3 has
its norm (g0/βγ)² + (g2/βγ)²γ in F[z] and of degree less than that of γ, a contradiction
to the choice of γ. Thus g2 = βγ, and since w3 − j is in S, g0 = 0, so
that w3 = j.

Since ij is in S, necessarily h3 divides βγ; βγ = h3m′ with m′ in F[z]. Now
is in S. Thus h0, h1, h2 are all divisible by h3
and w4 = (d0 + d1i + d2j + ij)/m′ with the d’s in F[z]. In S must be iw4 and w4j.
This is possible if and only if


(6) d1² ≡ γ, d2² ≡ d2β + α, d0 ≡ d1d2 (m′).


Now m′ has no square factors. Otherwise, if p² were a divisor of m′, p²
would divide βγ. If p were a divisor of γ, then because d1² ≡ γ (p²), p would
divide d1 and p² divide the square-free γ. Hence p would not divide γ, so
that p² would be a divisor of β. But then i1 = (d2 + i)/p would be integral,
i1j = j(i1 + β/p), and β/p have smaller degree than β. This is impossible from
our choice of β.

Our next step is to give a construction of w4. Let m be the product of all
prime powers p_n^{e_n} dividing βγ for which there exist solutions of


(7) x1n² ≡ γ, x2n² ≡ βx2n + α (p_n^{e_n}).


By means of a discussion similar to that for m′ we can show that m is also
square-free, i.e., e_n = 1. Using the Chinese remainder theorem we can find a
unique solution x1, x2 modulo m of the congruences (7) common to all p_n.
Therefore

x1² ≡ γ, x2² ≡ βx2 + α (m).

The quantity w = (x1x2 + x1i + x2j + ij)/m is integral. Its trace is βx1/m. This is
in F[z] since any factor of m dividing γ divides x1 because of (7) and the remaining
factors of m divide β. Furthermore


m²N(w) = x1²x2² + βx1²x2 + x1²α + γx2² + βγx2 + αγ
= (x1² + γ)(x2² + βx2 + α) ≡ 0 (m²);


thus N(w) is in F[z].
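The factorization of m²N(w) can be checked mechanically. The sketch below is illustrative only: it takes F to be the field of two elements, encodes polynomials in F[z] as Python integers (bit k is the coefficient of z^k), and assumes the usual norm form for this basis, N(u + vj) = n(u) + γn(v) with n(a + bi) = a² + abβ + b²α. The function names are hypothetical.

```python
def clmul(a, b):
    """Carry-less product: multiplication of polynomials over F_2 stored as bit masks."""
    result = 0
    while b:
        if b & 1:
            result ^= a   # addition over F_2 is XOR
        a <<= 1
        b >>= 1
    return result

def sq(a):
    return clmul(a, a)

def n(a, b, alpha, beta):
    """Norm of a + b*i when i^2 = beta*i + alpha (characteristic two)."""
    return sq(a) ^ clmul(clmul(a, b), beta) ^ clmul(sq(b), alpha)

def numerator_norm(x1, x2, alpha, beta, gamma):
    """Norm of x1*x2 + x1*i + x2*j + ij, written as u + v*j with
    u = x1*x2 + x1*i and v = x2 + i."""
    u_norm = n(clmul(x1, x2), x1, alpha, beta)
    v_norm = n(x2, 1, alpha, beta)
    return u_norm ^ clmul(gamma, v_norm)

def factored_form(x1, x2, alpha, beta, gamma):
    """(x1^2 + gamma) * (x2^2 + beta*x2 + alpha)."""
    return clmul(sq(x1) ^ gamma, sq(x2) ^ clmul(beta, x2) ^ alpha)
```

For any choice of bit masks the two functions agree, which is the identity used above; when both factors are divisible by m the product is divisible by m².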

The quantity w with 1, i, j forms a basis for an integral set S′. The conditions
C, R, and U are easily verified to be satisfied. To show that maximality
is true only for such a set S′, we need only show that every integral set S is
necessarily contained in such a set; in fact, only that w4 is in some S′.

Since (6) holds for m′, it is true of every prime factor of m′. Also m′ divides
βγ. From the definition of m, every prime factor of m′ divides m. Thus
m′ divides m; m = m′m″. We can find a solution x1, x2 of (7) for which x1 ≡ d1,
x2 ≡ d2 (p_n) whenever p_n is a divisor of m′. Consequently w4 is in (1, i, j, wm″)
which is in S′. We have proved our theorem.

Another form for the basis of Q of characteristic two(14) is 1, u1, u2, u1u2,
where

+ =p (p,0,7 in F(z)). 


Such a basis can be obtained by taking u1 = j, u2 = ij. A basis of Q of this form
can be found which is normalized to have ρ, σ, τ in F[z], u1 a quantity with
norm of lowest degree in the set of all inseparable integral quantities over
F(z), and u2 an integral quantity linearly independent of 1 and u1, inseparable
over F(z), and having for ρ a value in F[z] of lowest degree. Using much the
same reasoning as before we can prove


THEOREM 7. An integral set S with respect to a basis 1, u1, u2, u1u2 normalized
as above has a basis 1, u1, u2, w, where


(14) N. Jacobson, p-algebras of exponent p, Bulletin of the American Mathematical Society,
vol. 43 (1937), pp. 667-670.




Here y3 is determined as one of the quantities of lowest degree for which there
exists a solution of 


yo + yir + + + p(voys + = 0 (p) 
with the y’s in F[z].


4. Factorization when S is a principal ideal ring. Theorems 8 and 9 give 
sufficient conditions for an integral set S to possess a weakened form of a 
Euclidean algorithm. This form of the algorithm, however, is equivalent to 
the algorithm itself for quaternion algebras. 


THEOREM 8. Let Q of characteristic not two have a normalized basis with
σ, τ having degrees not greater than 1, and if both have degree 1, then one of them
being a quadratic residue of the other. Then if θ is in an integral set S of Q, and
m is a nonzero polynomial in F[z], there exists a quaternion κ in S such that
D(N(θ − κm)) < D(N(m)).


A proof of this theorem is easily effected when an explicit basis of S is 
known. 

If σ and τ are both in F, S = (1, i, j, ij).

Suppose σ is linear and τ is in F. Were τ a quadratic residue of σ, we would
have τ = a² and Q would not be a division algebra. Hence τ is not a quadratic
residue of σ and S = (1, i, j, ij). The case σ in F and τ linear is treated similarly.

Suppose that both σ and τ are linear. If (σ|τ) = (τ|σ) = −1 (this case is excluded
in the theorem), S = (1, i, j, ij). If however (σ|τ) = −(τ|σ) = 1, then
σ ≡ a² (τ), with a in F. Hence S = (1, i, j, w), where w is one of i(a ± j)/τ. The
case (τ|σ) = −(σ|τ) = 1 is handled similarly. Finally if (σ|τ) = (τ|σ) = 1, then
σ ≡ a² (τ), τ ≡ b² (σ), with a and b in F. Thus σ = a² + kτ, b² = −a²/k, and
i/aτ + j/bσ + ij/στ has norm 0; Q is total matric.
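The final case can be checked numerically. The sketch below is illustrative only: it assumes the norm form N(x) = x0² − τx1² − σx2² + στx3², takes τ = z and σ = a² + kτ with sample constants a, b, k satisfying b² = −a²/k, and evaluates everything at rational values of z. The function names are hypothetical.

```python
from fractions import Fraction

def norm(x, tau, sigma):
    """Norm form x0^2 - tau*x1^2 - sigma*x2^2 + sigma*tau*x3^2."""
    x0, x1, x2, x3 = x
    return x0*x0 - tau*x1*x1 - sigma*x2*x2 + sigma*tau*x3*x3

def zero_divisor_norm(z, a=1, b=1, k=-1):
    """Norm of i/(a*tau) + j/(b*sigma) + ij/(sigma*tau) at a rational point z.

    Assumed sample data: tau = z, sigma = a^2 + k*tau, with b^2 = -a^2/k.
    The point z must not be a root of tau or sigma."""
    z = Fraction(z)
    tau = z
    sigma = a*a + k*tau
    if b*b != Fraction(-a*a, k):
        raise ValueError("need b^2 = -a^2/k")
    x = (0, 1/(a*tau), 1/(b*sigma), 1/(sigma*tau))
    return norm(x, tau, sigma)

values = [zero_divisor_norm(z) for z in (2, 3, 5, Fraction(7, 3))]
```

Every entry of `values` is zero, in agreement with the statement that this quaternion has norm 0, so that Q cannot be a division algebra in this case.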

If in Theorem 8 we write θ = g0 + g1i + g2j + g3w, the quaternion κ = q0 + q1i
+ q2j + q3w is found by choosing the polynomials q_k to satisfy D(g_k − q_km)
< D(m); i.e., the q_k are the quotients on dividing the g_k by m.
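The choice of the q_k is ordinary polynomial division applied to each coordinate. A minimal sketch follows, assuming for illustration that F is a prime field of p elements and that polynomials are stored as coefficient lists from the constant term upward; the function names are hypothetical, and `pow(x, -1, p)` requires Python 3.8 or later.

```python
def poly_divmod(g, m, p):
    """Divide g by m in F_p[z]; return (quotient, remainder) as coefficient lists."""
    g = [c % p for c in g]
    m = [c % p for c in m]
    while m and m[-1] == 0:
        m.pop()
    if not m:
        raise ZeroDivisionError("division by the zero polynomial")
    lead_inv = pow(m[-1], -1, p)              # inverse of the leading coefficient
    q = [0] * max(len(g) - len(m) + 1, 1)
    r = g[:]
    for k in range(len(g) - len(m), -1, -1):
        c = (r[k + len(m) - 1] * lead_inv) % p
        q[k] = c
        for t, mc in enumerate(m):            # subtract c * z^k * m from the remainder
            r[k + t] = (r[k + t] - c * mc) % p
    while r and r[-1] == 0:
        r.pop()
    return q, r

def reduce_coordinates(theta, m, p):
    """Apply the division to each coordinate g_k of theta = (g0, g1, g2, g3)."""
    pairs = [poly_divmod(g, m, p) for g in theta]
    kappa = [q for q, _ in pairs]       # coordinates q_k of kappa
    remainder = [r for _, r in pairs]   # coordinates g_k - q_k*m, each of degree < D(m)
    return kappa, remainder

# Sample data over F_5: m = z^2 + 1 and four coordinate polynomials.
m = [1, 0, 1]
theta = [[3, 4, 0, 2, 1], [0, 1], [2, 2, 2, 2], [4]]
kappa, remainder = reduce_coordinates(theta, m, 5)
```

Each list in `remainder` has fewer entries than `m`, that is, D(g_k − q_km) < D(m) for every coordinate; the degree bound on N(θ − κm) asserted in Theorem 8 then rests on the hypotheses of that theorem.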


THEOREM 9. Let Q have characteristic two, with γ linear and α and β in F.
If θ is in the integral set S = (1, i, j, ij) of Q, and m is in F[z], then there exists
a quaternion κ in S such that D(N(θ − κm)) < D(m²).


That S has a basis 1, i, j, ij follows from the discussion in §3 and the fact
that x² + βx + α = N(x + i) is irreducible in F when Q is a division algebra. The
quaternion κ is determined as in the proof of Theorem 8.

When Q has characteristic two and F is perfect, we can take γ = z by
Theorem 5. If in addition β is in F, then we can assume β = 1. We can also
have α in F. For, since F is perfect, α has the form a1² + a2²z. The degree of α
is reduced to zero by repeated application of the transformation


jf=j. 




We then have a basis for this Q satisfying the hypothesis of Theorem 9.
Theorems 8 and 9 imply the existence of a Euclidean algorithm for the
integral sets involved(15). The presence of such a process assures us that S is a
principal ideal ring.
Whenever S is a principal ideal ring, the following decomposition theorem
is true. A proof can be made using a procedure developed for rational algebras(16).


THEOREM 10. Let S be a principal ideal ring. Let θ be a quaternion in S not
divisible by a polynomial in F[z]. If N(θ) = p1p2 ⋯ pn, where the p_k are irreducible
polynomials, then θ = π1π2 ⋯ πn, where N(π_k) = p_k and π1 is unique
except for multiplication by units of S on the right, π2, ⋯, π_{n−1} are unique
but for multiplication by units on the right or left, and π_n is unique except for left
unit factors.


5. Prime quaternions in S. In this section we seek to determine when a 
quaternion is prime in S. In particular we want to know when a prime in 
F[z] is prime in S. All ideals considered are left ideals. 


THEOREM 11. Let F have characteristic not two. Then a necessary and sufficient
condition that the principal ideal (p) defined by a prime p of F[z] not dividing
the fundamental number d of Q be divisorless in S is that there exist no solution
in F[z] of the congruence


(8) 1 − τy1² − σy2² + στy3² ≡ 0 (p).
If (8) holds, let


(9) ξ = 1 + y1i + y2j + y3ij


and let P be the left ideal (ξ, p); P is a proper divisor of (p). Also P ≠ (1).
For otherwise 1 = αξ + βp with α, β in S, and ξ̄ = αN(ξ) + βpξ̄ ≡ 0 (p), an impossibility.
Hence (p) is not a divisorless ideal.

Conversely, suppose there is a left ideal $P \neq (1)$ which properly divides
(p); i.e., P contains a quaternion $\xi$ not divisible by p. Necessarily $N(\xi)$ is
divisible by p. If p does not divide $\sigma'\tau'$, by multiplying $\xi$ by i, j, or ij if
necessary, we can obtain an element $\xi_0$ whose coefficient $x_0$ of 1 is not congruent to
0 (p). We can find a solution m in F[z] of $m\sigma'\tau' x_0 \equiv 1$ (p), $m\sigma'\tau' x_0 = 1 + rp$.
Then $\xi_0 m\sigma'\tau' - rp = 1 + y_1 i + y_2 j + y_3 ij$ has norm congruent to 0 (p), and the $y_k$
are in F[z]. Hence congruence (8) has a solution. If p divides $\sigma'\tau'$, then from
the property of such a prime factor we know that (8) has a solution.

If the ideals of S are all principal, then $kp = \bar\pi_1 \pi_1$, where k is in F and

(15) H. Rauter, Quaternionenalgebren mit Komponenten aus einem Körper von
Primzahlcharakteristik, Mathematische Zeitschrift, vol. 29 (1929), pp. 234-263.

(16) C. G. Latimer, On ideals in generalized quaternion algebras and Hermitian forms, these
Transactions, vol. 38 (1935), pp. 443-444.


1940] QUATERNION ALGEBRAS 445 


$(\pi_1) = (\xi, p)$ with $\xi$ defined in (9). Also $\pi_1$ is a divisorless quaternion of S
because its norm is a prime in F[z].

It is known(17) that only the prime divisors of the fundamental number d
are ramified in S.


THEOREM 12. Every prime divisor p of the fundamental number of Q of
characteristic not two generates an ideal in S which is the square of a two-sided prime
ideal R.


A proof of this theorem can be made by following the steps in the
demonstration of the analogous theorem for rational quaternion algebras by A.
Spaltenstein(18). Let $S_p$ denote the difference algebra S - (p) where p is in
F[z]. If p is a prime dividing the fundamental number d of Q, then $S_p$
contains a unique nonzero idempotent element. Using this fact we can prove
that the radical $R_p$ of $S_p$ has exponent two and is the only maximal proper
two-sided ideal in $S_p$. The ideal R in Theorem 12 is the set of quantities of S
in the residue classes comprising $R_p$.


THEOREM(19) 13. Let Q be over F(z), where F is a finite field. Then no prime
of F[z] generates a prime ideal in S.


First, let F have characteristic not two. If a quantity p of F[z] generates
a prime ideal in Q, it does not divide the discriminant of S, as a result of
Theorem 12. Then $S_p$ is semisimple. Also since (p) is prime in S, $S_p$ contains
no divisors of zero; hence $S_p$ is a division algebra and because it is also finite,
it is a field. Thus $-ij = ji \equiv ij$ (p); whence $2 \equiv 0$ (p), an impossibility. If F has
characteristic 2, we may take $\gamma = z$. Every quantity in F[z] has the form
$f(z^2) + g(z^2)\cdot z = f_1(z)^2 + g_1(z)^2\cdot z$ and is therefore the norm of $f_1(z) + g_1(z)\cdot j$. This
completes the proof of our theorem.
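The characteristic-two step can be checked mechanically in the smallest case. The sketch below is only an illustration and makes two assumptions that are not in the text: F is taken to be GF(2), where squaring a polynomial replaces z by z^2, and a polynomial is encoded as a Python integer whose bit k is the coefficient of z^k. The function names are hypothetical.

```python
def split_squares(p: int) -> tuple[int, int]:
    """Return (f, g) with p(z) = f(z)^2 + z * g(z)^2 over GF(2)."""
    f = g = 0
    k = 0
    while p >> k:
        if (p >> k) & 1:
            if k % 2 == 0:
                f |= 1 << (k // 2)   # even power z^(2m) gives the term z^m of f
            else:
                g |= 1 << (k // 2)   # odd power z^(2m+1) gives the term z^m of g
        k += 1
    return f, g


def square_gf2(a: int) -> int:
    """Square a GF(2) polynomial: each z^k becomes z^(2k)."""
    out = 0
    k = 0
    while a >> k:
        if (a >> k) & 1:
            out |= 1 << (2 * k)
        k += 1
    return out


def norm_f_plus_gj(f: int, g: int) -> int:
    """Norm of f + g*j when j^2 = z (signs are irrelevant in characteristic 2)."""
    return square_gf2(f) ^ (square_gf2(g) << 1)


p = 0b1011011                 # z^6 + z^4 + z^3 + z + 1
f, g = split_squares(p)       # f = z^3 + z^2 + 1, g = z + 1
assert norm_f_plus_gj(f, g) == p
```

Over a larger finite field of characteristic two the same splitting works after replacing each coefficient by its square root, which exists because the field is perfect.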

By a result of Eichler(20), every ideal in S is principal when F is a finite
field. This fact, together with Theorem 13, gives


THEOREM 14. When F is finite, every polynomial in F[z], except for a
factor in F, is the norm of a quaternion in S.


As a particular instance we have that every polynomial in F[z] is expressible
in the form $x_0^2 - f x_1^2 + (z - g)(x_2^2 - f x_3^2)$ where f is a non-square fixed quantity
in F, g is fixed in F, and the x's take values in F[z].

Combining the results of Theorems 11 and 14 for F finite and of charac- 
teristic not two, we see that congruence (8) always has a solution if p does 


(17) M. Deuring, Algebren, 1935, p. 84. 

(18) A. Spaltenstein, Struktur und Zahlentheorie einer Klasse von Algebren, Zurich
Dissertation, 1934, p. 24.

(19) For the rational analogue see A. Speiser, "Idealtheorie in rationalen Algebren," in
L. E. Dickson, Algebren und ihre Zahlentheorie, 1927, p. 302.

(20) M. Eichler, Über die Idealklassenzahl hyperkomplexer Systeme, Mathematische
Zeitschrift, vol. 43 (1938), pp. 481-494.




not divide the fundamental number. It can be shown, however, that if
$(p, \sigma) = 1$, there is a solution of


$x^2 - \sigma y^2 - \tau \equiv 0$ (p),


a more inclusive fact.

For the remainder of this section we restrict F to be the field of all real
numbers; hence we can take $\tau = -1$. The primes in F[z] are either linear or
definite quadratic.

If p is positive definite, $p = z^2 + 2rz + s$, and the ideal generated by
$(s - r^2)^{1/2} + (z + r)i$, whose norm is p, properly contains (p); hence (p) is not a
divisorless ideal of S.
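The positive definite case can be checked numerically. The sketch below assumes $i^2 = \tau = -1$, as in this part of the section, and reads the generator as $(s - r^2)^{1/2} + (z + r)i$; polynomials are ascending coefficient lists, and all helper names are hypothetical.

```python
import math


def poly_mul(a, b):
    """Multiply two polynomials given as ascending coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_add(a, b):
    """Add two polynomials given as ascending coefficient lists."""
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))
    b = b + [0.0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]


def norm_in_G(x0, x1):
    """N(x0 + x1*i) = x0^2 + x1^2 when i^2 = -1 (assumed normalization)."""
    return poly_add(poly_mul(x0, x0), poly_mul(x1, x1))


def generator_for(r, s):
    """Components (x0, x1) of an element of norm z^2 + 2rz + s, for s > r^2."""
    if s <= r * r:
        raise ValueError("z^2 + 2rz + s is not positive definite")
    return [math.sqrt(s - r * r)], [r, 1.0]


x0, x1 = generator_for(1.0, 5.0)          # p = z^2 + 2z + 5
assert norm_in_G(x0, x1) == [5.0, 2.0, 1.0]
```

Since p is then the norm of an element of the ideal, p lies in that ideal, which is why the ideal contains (p).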

If p is linear, $p = z - a$, and if there is a solution of the congruence (8),
then evaluating the left member at $z = a$, we get the necessary condition that
the polynomial $\sigma = \sigma(z)$ must have a positive value for $z = a$. Conversely, if
$\sigma(a) > 0$ and $p = z - a$, a solution of (8) exists; e.g., $x_1 = 0 = x_3$, $x_2 = \sigma(a)^{-1/2}$.
We have proved


THEOREM 15. Let Q be a generalized quaternion algebra over the field F(z),
where F is the field of all real numbers. A quantity p(z) of F[z] generates a
divisorless ideal in the integral set S of Q with respect to a normalized basis if and
only if p(z) is linear and the root of p(z) = 0 gives $\sigma$ a negative value.
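The criterion of Theorem 15 is easy to test numerically. The sketch below is an illustration under stated assumptions: $\tau = -1$, the left member of (8) is read as $1 - \tau x_1^2 - \sigma x_2^2 + \sigma\tau x_3^2$, polynomials are ascending lists of real coefficients, and the function names are hypothetical.

```python
import math


def poly_eval(coeffs, x):
    """Evaluate an ascending coefficient list at x by Horner's rule."""
    value = 0.0
    for c in reversed(coeffs):
        value = value * x + c
    return value


def generates_divisorless_ideal(p, sigma):
    """Theorem 15: p must be linear and sigma negative at the root of p."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]                       # drop zero leading coefficients
    if len(p) != 2:
        return False
    root = -p[0] / p[1]
    return poly_eval(sigma, root) < 0


def left_member_of_8(x1, x2, x3, sigma_value, tau=-1.0):
    """Assumed form of the left member of (8), evaluated at a point."""
    return 1 - tau * x1**2 - sigma_value * x2**2 + sigma_value * tau * x3**2


def solution_at_root(a, sigma):
    """A solution of (8) modulo z - a when sigma(a) > 0, else None."""
    s = poly_eval(sigma, a)
    if s <= 0:
        return None
    return 0.0, 1.0 / math.sqrt(s), 0.0


sigma = [0.0, 1.0]                                  # sigma(z) = z
assert generates_divisorless_ideal([4.0, 1.0], sigma)        # p = z + 4
assert not generates_divisorless_ideal([-4.0, 1.0], sigma)   # p = z - 4
```

For p = z - 4 the triple returned by `solution_at_root(4.0, sigma)` makes the left member vanish at z = 4, so (8) is solvable modulo p and (p) is not divisorless.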


As a result of Theorem 8 we know that S is a principal ideal ring if $\sigma$ is
linear. This and the fact that the product of two norms is a norm give


COROLLARY. If and only if all the monic linear factors of a square-free
polynomial f in F[z] have their constant terms not less than c, then


[ot (mt (xx in F[z]) 


If and only if all the constant terms are not greater than c, 


$f = \pm [x_0^2 + x_1^2 + (x_2^2 + x_3^2)(z - c)]$ ($x_k$ in F[z]).


If $\sigma = -1 = \tau$, then the left member of (8) is always positive for any value
of z. It is never divisible by a linear polynomial. Using this fact and the result
that S is a principal ideal ring, we obtain


THEOREM 16. Let F be the field of all real numbers, and $\sigma = -1 = \tau$. Then
every linear polynomial in F[z] is prime in S, and every irreducible quadratic
polynomial is, except for sign, the norm of a quaternion in S.

When F is the rational number field, there are some positive definite
polynomials(21), e.g., $z^2 + 7$, which are prime in S with $i^2 = -1 = j^2$.


(21) E. Landau, Über die Zerlegung definiter Funktionen in Quadrate, Archiv der
Mathematik und Physik, (3), vol. 7 (1904), pp. 271-277.




6. Equivalence of Hermitian forms and left ideals(22). Denote by G the
integral domain F[z, i]; if the basis of Q is normalized, G is the set of all
integral elements in the quadratic extension F(z, i) of the field F(z). Let W
designate the set of all quaternions $x = q_0 + q_1 i + q_2 j + q_3 ij$ with components $q_k$
in F[z]; W has a basis 1, j over G. Thus $x = p_1 + p_2 j$ with $p_1$, $p_2$ in G and

$N(x) = p_1 \bar p_1 - \eta p_2 \bar p_2$,

where $\eta = \gamma$ or $\sigma$ according as F has or has not characteristic two. The
conjugate of a quantity w of G is written $\bar w$.

A left ideal L of W is called regular if it has a basis (called a regular basis)
$\omega_1$, $\omega_2$ over G where $\omega_m = g_{m1} + g_{m2} j$ (m = 1, 2) with the $g_{mn}$ in G and the
determinant $|g_{mn}|$ in F[z] and monic. The value of the determinant $|g_{mn}|$ is
independent of the basis $\omega_1$, $\omega_2$ and is the norm N(L) of L. A left ideal L is said
to be equivalent to a left ideal L' if there exist quantities $\rho$, $\rho'$ in W for which
$L\rho = L'\rho'$ and $N(\rho\rho')$ is monic.

A form


(10) $f(x_1, x_2) = a x_1 \bar x_1 + b x_1 \bar x_2 + \bar b \bar x_1 x_2 + c x_2 \bar x_2$


with a, c in F[z] and b in G is called a Hermitian form of G and its
determinant is defined to be $b\bar b - ac$. We suppose that the x's run over elements of G.
If another Hermitian form $f'(y_1, y_2)$ can be obtained from $f(x_1, x_2)$ by a linear
homogeneous transformation of determinant unity with coefficients in G,
then f and f' are called equivalent.

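Let L be a regular ideal with the regular basis $\omega_m = g_{m1} + g_{m2} j$ (m = 1, 2).
Since $j\omega_1$, $j\omega_2$ are in L,


(11) $j\omega_m = b_{m1}\omega_1 + b_{m2}\omega_2$ (m = 1, 2),

where the b's are in G. If we designate the general element of L by $\xi$,


$\xi = x_1\omega_1 + x_2\omega_2 = (g_{11}x_1 + g_{21}x_2) + (g_{12}x_1 + g_{22}x_2)j$,
$j\xi = c_1\omega_1 + c_2\omega_2 = (g_{11}c_1 + g_{21}c_2) + (g_{12}c_1 + g_{22}c_2)j$,


where $c_n = b_{1n}\bar x_1 + b_{2n}\bar x_2$ (n = 1, 2), and $x_1$, $x_2$ are in G. Then


$N(\xi) = \begin{vmatrix} x_1 & x_2 \\ c_1 & c_2 \end{vmatrix} \cdot \begin{vmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{vmatrix} = N(L)\cdot f(x_1, x_2)$,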


(22) For the rational analogue see C. G. Latimer, On ideals in generalized quaternion algebras
and Hermitian forms, these Transactions, vol. 38 (1935), pp. 436-446; C. G. Latimer, On ideals 
in a quaternion algebra and the representation of integers by Hermitian forms, these Transactions, 
vol. 40 (1936), pp. 439-449; C. G. Latimer, On the class number of a quaternion algebra with a 
negative fundamental number, these Transactions, vol. 40 (1936), pp. 318-323; J. D. H. Teller, 
A class of quaternion algebras, Duke Mathematical Journal, vol. 2 (1936), pp. 280-286. 




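where


(12) $f(x_1, x_2) = \begin{vmatrix} x_1 & x_2 \\ c_1 & c_2 \end{vmatrix} = b_{12}x_1\bar x_1 - b_{11}\bar x_1 x_2 + b_{22}x_1\bar x_2 - b_{21}x_2\bar x_2$.

Since $N(\xi)$ and N(L) are in F[z], $f(x_1, x_2)$ is in F(z) for $x_1$, $x_2$ in G. Since f is a
polynomial in G, it takes values in G. Hence $f(x_1, x_2)$ takes values in F[z] for
$x_1$, $x_2$ in G, and f is consequently Hermitian. We say that f corresponds to the
regular basis $\omega_1$, $\omega_2$.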
The relation between classes of ideals and of forms is described in 


THEOREM(23) 17. There is a one-to-one correspondence between the classes of
regular ideals of W over G and the classes of Hermitian forms with determinant $\eta$
representing a monic quantity in F[z].


We next prove 


LEMMA 2. An ideal L of W is principal if and only if it is regular and any
Hermitian form $f(x_1, x_2)$ corresponding to it represents a nonzero quantity in F.


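Let $f(x_1, x_2)$ correspond to a regular ideal $L = (\omega_1, \omega_2)$ and $f(r_1, r_2) = a_0$ in F.
Then $a_0 = b_{12}r_1\bar r_1 - b_{11}\bar r_1 r_2 + b_{22}r_1\bar r_2 - b_{21}r_2\bar r_2$, where the $b_{mn}$ are defined by (11). If
$\rho = r_1\omega_1 + r_2\omega_2$, then $N(\rho) = a_0 N(L)$. The transformation


$\rho = r_1\omega_1 + r_2\omega_2$,
$\rho' = (\bar r_1 b_{11} + \bar r_2 b_{21})\omega_1/a_0 + (\bar r_1 b_{12} + \bar r_2 b_{22})\omega_2/a_0$


has determinant 1, so that $\rho$, $\rho'$ is a regular basis of L. But $\rho' = j\rho/a_0$. Hence
$L = (\rho)$.

Conversely, if $L = (\rho)$, L has the regular basis $(\rho, j\rho/r_0)$, where $r_0$ is the
leading coefficient of $N(\rho)$. To this basis corresponds $f(x_1, x_2) = r_0 x_1\bar x_1 - (\eta/r_0)x_2\bar x_2$,
which represents $r_0 = f(1, 0)$ in F.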

Noticing that in the last paragraph $r_0$ determines f, we have the


COROLLARY. The number of classes of principal ideals is equal to the index
of the group of all quantities of F which are leading coefficients of norms of unit
quaternions in W, in the group of all quantities of F which are leading coefficients
of norms of quaternions in W.


We also have the

COROLLARY. If W is a principal ideal ring, every ideal is regular.

We next state


THEOREM 18. A necessary condition that every ideal in W be principal is 
that W be an integral set S. 


(23) C. G. Latimer, On ideals in generalized quaternion algebras and Hermitian forms, these
Transactions, vol. 38 (1935), p. 442.




Let $\zeta$ equal $\beta\gamma$ or $\sigma\tau$ according as F has or has not characteristic two,
respectively. Suppose that every ideal in W is principal. Now $S\zeta$ is in W, and,
since $W(S\zeta) \subseteq S(S\zeta) = S\zeta$, we conclude that $S\zeta$ is an ideal in W. Therefore
$S\zeta = W\omega$, with $\omega$ in W. Since 1 is in both S and W, we have $\zeta = \mu\omega$ and $\nu\zeta = \omega$
with $\mu$ in W and $\nu$ in S. Then $\nu\mu\omega = \omega$, so that $\nu\mu = 1$, and $\mu$, $\nu$ are units in W.
Next, $W\omega = (W\mu)\omega = W\zeta$. Finally W = S.

The conditions of Theorem 18 and the second corollary of Theorem 17 
are by no means sufficient as results at the end of this section show. 

Every Hermitian form f' of determinant $\eta$ is equivalent to a form
f with $D(b_0) < D(a)$, $D(b_1) < D(a) \le D(c)$, where
$b = b_0 + b_1 i$. This result is obtained by successive applications of the two
transformations $x_1' = x_1 + kx_2$, $x_2' = x_2$ (k in G); and $x_1' = x_2$, $x_2' = -x_1$.

We assume in the next two paragraphs that $D(\alpha)$ and $D(\beta)$ are not greater
than 1, or $D(\tau) \le 1$, according as F has or has not characteristic two. Then
$D(a) + D(c) = D(\eta)$.

If also $D(\eta) \le 1$, then D(a) = 0. We see that f represents a quantity in F,
and if f corresponds to L, L is principal. Every regular ideal in W over G is
principal.

But if $D(\eta) = 2$, then $D(a) \le 1$, and $b_0$ and $b_1$ are in F. If D(a) = 1, $\eta$ is
monic, and $a_0$, $c_0$ are the leading coefficients of a, c, respectively, then
$a_0 c_0 = -1$. Hence $f(1, a_0)$ is in F and consequently f corresponds to a class of
principal ideals. This is also true of f if D(a) = 0. We conclude that all regular
ideals of W over G are principal when $\eta$ is monic and quadratic.

Now let F be a field in which not every quantity is a square. Examples of
such fields are subfields of real numbers. Also if F is finite of characteristic
not two, then F contains non-square quantities. For, corresponding to each
nonzero square $a^2$, there are two distinct elements a, -a in the field; hence the
set of squares does not exhaust the field.
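The counting argument can be seen directly in a prime field. The sketch below assumes F = GF(p) for a prime p, represented by the integers 0, ..., p - 1; the function names are hypothetical.

```python
def squares_mod(p: int) -> set[int]:
    """All squares in GF(p), including 0."""
    return {(a * a) % p for a in range(p)}


def nonsquares_mod(p: int) -> set[int]:
    """Elements of GF(p) that are not squares."""
    return set(range(p)) - squares_mod(p)


# For odd p the nonzero elements pair off as a, -a with the same square,
# so only (p - 1) / 2 of the p - 1 nonzero elements are squares.
assert len(squares_mod(7)) == 4
assert nonsquares_mod(7) == {3, 5, 6}
```

For p = 2 the pairing collapses because a = -a, and every element is a square, which is why the characteristic is required to be different from two.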


THEOREM 19. Let F be a field of characteristic not two in which not every
quantity is a square. Let $\sigma$ in F[z] be of odd degree and reducible, and $\tau$ in F[z]
of even degree with leading coefficient not a square. Then the regular ideals of W
are not all principal.


Let $\sigma = \sigma_1\sigma_2$, where $\sigma_1$, $\sigma_2$ are in F[z] and not in F. Then the Hermitian
form

(13) $f = \sigma_1 x_1\bar x_1 - \sigma_2 x_2\bar x_2$,

where $x_1 = y_0 + y_1 i$, $x_2 = y_2 + y_3 i$, with the y's in F[z], has determinant $\sigma$
and f does not represent a quantity in F. For, $y_0^2 - \tau y_1^2$ and $y_2^2 - \tau y_3^2$ both have
even degrees and one of $\sigma_1$, $\sigma_2$ has even degree and the other odd degree. Thus
$f(x_1, x_2)$ for $(x_1, x_2) \neq (0, 0)$ has degree at least that of one of $\sigma_1$, $\sigma_2$.


THEOREM 20. Let F be the field of all real numbers. The integral set S with
respect to the normalized basis of Theorem 3 is equal to W and is a principal
ideal ring if and only if $\sigma$ has degree not greater than 1, or degree 2 with positive
leading coefficient.


The fact that S is a principal ideal ring when $D(\sigma) \le 1$ is a consequence of
Theorem 8.

When $\sigma$ has odd degree greater than 1, Theorem 19 states that W is not a
principal ideal ring.

If $D(\sigma) = n \ge 4$ is even and $\sigma$ has leading coefficient +1, then


$f = \sigma_1\sigma_3 \cdots \sigma_{n-1}\, x_1\bar x_1 - \sigma_2\sigma_4 \cdots \sigma_n\, x_2\bar x_2$,


where the $\sigma_i = z - a_i$ ($a_i < a_{i-1}$) are the linear factors of $\sigma$, cannot represent a
nonzero quantity in F. For, if we set $z = a_1$, f is always negative or zero; and
for $z = a_3$, f is always positive or zero.

If $n \ge 2$ is even and $\sigma$ has leading coefficient -1, and if $\sigma_1$, $\sigma_2$ are two
nonconstant factors of $\sigma = \sigma_1\sigma_2$, then (13) cannot represent a quantity in F
because the two terms $\sigma_1 x_1\bar x_1$ and $-\sigma_2 x_2\bar x_2$ have leading coefficients of the same
sign; there can be no reduction in degree by adding values of these two
terms.

We have already shown that when the degree of $\sigma$ is 2 and $\sigma$ is monic,
the regular ideals of W are principal. It remains to prove that every ideal in
W is regular. We can find a basis a, b + jc of an ideal L with a, b, c in G, since G
is a principal ideal ring. Since ja and j(b + jc) are in L, $a = a_1 c$, $b = b_1 c$. The
ideal $L_1 = (a_1, b_1 + j)$ is equivalent to L because $L_1 c/c_0 = L$, where $c_0$ is the
square root of the leading coefficient (which is necessarily positive) of N(c).

We shall show that $a_1$ is in F[z]. Now $\bar a_1(b_1 + j) = \bar a_1 b_1 + j a_1$ is in $L_1$;
therefore $\bar a_1 b_1 \equiv 0$ ($a_1$). Let $a_1 = a'a''$ where a' is the largest factor of $a_1$ in F[z]; i.e.,
the factors of a'' divide no linear polynomials in F[z]. Then $b_1 \equiv 0$ (a''). Also
in $L_1$ is $N(b_1 + j) = b_1\bar b_1 - \sigma$; hence $b_1\bar b_1 - \sigma \equiv 0$ ($a_1$), $\sigma \equiv 0$ (a''). But $\sigma$ is a product
of linear factors in F[z]; hence a'' is in F and $a_1$ is in F[z]. We can take the
leading coefficient of $a_1$ to be unity. Then $L_1$ has the regular basis $a_1$, $b_1 + j$ and
$L_1$ is regular. Also L, being equivalent to $L_1$, is likewise regular.

Thus, using Theorem 20, we can always determine whether an integral 
set of a quaternion algebra Q having as F the field of all real numbers has 
ideals which are not principal. 


UNIVERSITY OF CHICAGO, 
CHICAGO, ILL.


ORDER TYPES AND STRUCTURE OF ORDERS 


BY 
ANDRE GLEYZAL(1)


1. Introduction. This paper is concerned with operations on order types
or order properties $\alpha$ and the construction of order types related to $\alpha$. The
reference throughout is to simply or linearly ordered sets, and we shall speak
of $\alpha$ as either property or type. Let $\alpha$ and $\beta$ be any two order types. An order A
will be said to be of type $\alpha\beta$ if it is the sum of $\beta$-orders (orders of type $\beta$)
over an $\alpha$-order; i.e., if A permits of decomposition into nonoverlapping
segments each of order type $\beta$, the segments themselves forming an order of
type $\alpha$. We have thus associated with every pair of order types $\alpha$ and $\beta$ the
product order type $\alpha\beta$.

The definition of product for order types automatically associates with
every order type $\alpha$ the order types $\alpha\alpha = \alpha^2$, $\alpha\alpha^2 = \alpha^3$, $\cdots$. We may
furthermore define, for all ordinals $\lambda$, a $\lambda$th power of $\alpha$, $\alpha^\lambda$, and finally a limit order
type $\alpha^I$. This order type has certain interesting properties. It has closure with
respect to the product operation, for the sum of $\alpha^I$-orders over an $\alpha^I$-order
is an $\alpha^I$-order, i.e., $\alpha^I\alpha^I = \alpha^I$. For this reason we call $\alpha^I$ iterative. In general,
we term an order type $\beta$ having the property that $\beta\beta = \beta$ iterative. $\alpha^I$ has the
following postulational identification:

1. $\alpha^I$ is a supertype of $\alpha$; that is to say, all $\alpha$-orders are $\alpha^I$-orders.

2. $\alpha^I$ is iterative.

3. $\alpha^I$ is minimal in the sense that all iterative supertypes of $\alpha$ are
supertypes of $\alpha^I$.

It may be shown that these conditions determine a unique $\alpha^I$, once $\alpha$ is
given. Accordingly, we term $\alpha^I$ the minimal iterative supertype of $\alpha$. In
particular, when we prescribe $\alpha$ to be the type, "either normal or reverse normal,"
$\alpha^I$, it turns out, is the type scattered(2). Thus we find that scattered orders are
constructible from normal and reverse normal orders by the product
operation.

Other fundamental operations on orders, such as taking a segment of an
order, summing over a normal order, or forming a suborder or superorder of
an order, lead to the definition of other order types associated with $\alpha$, and
to other properties of order types such as descending, extensive, etc. We denote
these associated order types by @?, a”, a”, a®, and a”. They are unique, de- 

Presented to the Society, December 27, 1939, under the title A general theorem on the 
structure of linear orders; received by the editors January 9, 1940. 

(1) I wish to express my appreciation to Professor H. Blumberg for his generous aid in the
preparation of this paper. A summary of its principal results is contained in the Proceedings of
the National Academy of Sciences, vol. 23 (1937), pp. 291-292.

(2) An order is said to be scattered if it contains no dense suborders.




pending only upon the choice of a, and the first four of them have closure and 
minimal properties analogous to those described for $\alpha^I$. The type $\alpha^D$ is shown
to be a descending, and $\alpha^E$ an extensive order type. Also associated with $\alpha$
are two types which we term $\alpha$-dense and $\alpha$-scattered. They are, as the names
indicate, generalizations of dense and scattered. $\alpha$-scattered is iterative and
has a certain minimal property with respect to $\alpha$. Of particular interest is the
case where $\alpha$ is chosen to be the property of containing $\aleph_\lambda$ or more elements,
where $\aleph_\lambda$ is the $\lambda$th transfinite cardinal. We denote the two order types
associated with this $\alpha$ by $\aleph_\lambda$-dense and $\aleph_\lambda$-scattered, respectively. $\aleph_0$-dense and
$\aleph_0$-scattered become the properties dense and scattered themselves.

On the basis of these associated order types there may be developed what
amounts to an algebra of order types. For example, we may form, by
combination, such types as $\alpha^{DE}$ (meaning $\beta^E$, where $\beta = \alpha^D$). The property $\alpha^{DEI}$,
important in our considerations, is denoted more simply by $\alpha^T$. We find,
remarkably, that $\alpha$-scattered is equivalent to $\alpha^{NT}$ ($= (\alpha^N)^T$).

A principal result is the following one which gives a decomposition of
every order with respect to every order type. It may be stated as follows.
If A is any order and $\alpha$ any order type, A is either of type $\alpha^T$ or is the sum of
$\alpha^T$-orders over an order no proper segment(3) of which has type $\alpha^T$. This is a
generalization of the well known theorem, due to Hausdorff(4), that every
order is either scattered or the sum of scattered orders over a dense order. In this
paper, the latter decomposition is the one associated with the property
"normal or reverse normal." The order type $\alpha^T$, it is found, is simultaneously
descending, extensive and iterative. Such a property we term transitive. It
may be shown, furthermore, that $\alpha^T$ is the minimal transitive supertype of $\alpha$.
The property $\alpha$-scattered is transitive for all $\alpha$, and all transitive order types
are supertypes of the type scattered. If $\alpha$ is transitive, $\alpha^T$ is equivalent to $\alpha$
and the above decomposition theorem implies: If A is any order and $\alpha$ a
transitive order type, A is either of type $\alpha$ or the sum of $\alpha$-orders over an order no
proper segment of which is of type $\alpha$.

The decompositions we obtain, corresponding to various particular $\alpha$'s,
give insight into the structure of orders and suggest a number of theorems of
general nature. One such theorem we prove is that every order of regular(5)
cardinal $\aleph_\lambda$ contains either an $\aleph_\lambda$-dense order, or the normal order $\omega_\lambda$, or the
reverse of $\omega_\lambda$.

An order J, of transfinite integers is introduced which satisfies the following 
universality conditions: J, is scattered, of cardinal &,, and contains all scat- 
tered orders of cardinal less than &,. 

Problems arise as to properties and methods for constructing orders of 


(3) By a proper segment we understand a segment with more than one element.

(4) Grundzüge der Mengenlehre, pp. 95-97.

(5) The cardinal $\aleph_\lambda$ and the normal order $\omega_\lambda$ initiating the cardinal $\aleph_\lambda$ are said to be
regular if every suborder of $\omega_\lambda$ cofinal with $\omega_\lambda$ is of type $\omega_\lambda$.




types such as $\aleph_\lambda$-scattered or $\omega_\lambda$-scattered. The considerations of this paper
lead also to other problems on orders and order types. A number of these are
alluded to (see §12), their solution awaiting future research.

2. Decomposition of an order. Let A be a given order and $\alpha$ an order type
or order property. The problem which we wish to consider is that of obtaining
a composition of A in terms of orders of type $\alpha$. Later, we give a formal proof
of the composition theorem stated in the introduction, but we shall first
proceed inductively, tracing step by step the ideas leading to the result we have
in mind. To obtain a segmental decomposition, i.e., a separation of the order
A into segments, let us begin by associating with an element a of A the
elements e of A such that the segment (a, e) or (e, a), taken to include
end-elements, has property $\alpha$. The elements thus associated with a form a set $S_a$
which may or may not be a segment of A. To insure that $S_a$ form a segment,
let us require that $\alpha$ be such that every initial and every final segment of an
$\alpha$-order (an order having property $\alpha$) be likewise an $\alpha$-order. This is
equivalent to requiring that every segment of an $\alpha$-order be an $\alpha$-order. This
condition upon $\alpha$ we express by saying that $\alpha$ is a descending order property.
Furthermore, to insure that two different sets $S_a$ have no common elements, we
ask that the sum(6) of two $\alpha$-orders be again an $\alpha$-order. An order property
obeying the latter condition we term additive. Therefore, if $\alpha$ is a descending,
additive order property, the order A is the sum of nonoverlapping segments
$S_a$ as defined, and we have determined a composition for A. The segments $S_a$
themselves form a new order $A_1$ if we set $S_a < S_{a'}$ when a < a', and we shall
say that A is the sum of $\alpha$-orders over the base $A_1$. In the same way as A, the
order $A_1$ may be decomposed with respect to $\alpha$, yielding a new base order $A_2$
whose elements are now segments of $A_1$. Since each element of $A_1$ is a segment
of A, we may again consider the elements of $A_2$ as segments of A. Continuing,
we secure, for every integer n, the base order $A_n$ with elements interpretable
as segments of A. A "limit" order $A_\omega$ may be formed as follows. Let a be an
arbitrary element of A, and $S_a$ the set of elements e of A belonging, for some n,
to the elements of $A_n$ containing a. The set $S_a$ is a segment of A and the sum
of such segments constitutes A. Let $A_\omega$ be the order with these segments as
elements. We may now continue this process beyond the $\omega$th stage until
finally we reach an order $A_\mu$, where $\mu$ is a transfinite ordinal, such that no
proper segment of $A_\mu$ has property $\alpha$. The order A is the sum of segments S
of A over the base order $A_\mu$. We observe that each segment S may be built
up from $\alpha$-orders by means of the following operations:

(1) Forming an order by substituting $\alpha$-orders for the elements of an
$\alpha$-order or an order already constructed.


(6) By the sum A+B of two orders A and B is meant the order formed by placing all
elements of B after all elements of A, no change being made in the relative position of elements
in A or in B.




(2) Forming an order by substituting $\alpha$-orders or orders already
constructed for the elements of a normal or reverse normal order.

Let us call an order which may be built up from $\alpha$-orders by means of
these two operations an $\alpha^T$-order. We may then say that A is the sum of
$\alpha^T$-orders over a base order no proper segment of which has property $\alpha$. It
may be furthermore shown that this base order has no proper segment with property $\alpha^T$,
but we defer the proof until later. We choose to start anew making use of the
notions we have just obtained.
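The first step of this construction can be imitated on a finite order. The sketch below is an illustration only, under assumptions not made in the text: the order A is a sorted Python list of numbers, and the property $\alpha$ is the hypothetical "no two neighbouring elements differ by more than 1," which is descending. The function names are likewise hypothetical.

```python
def associated_set(order, a, has_property):
    """Indices e such that the closed segment between a and e has the property."""
    out = []
    for e in range(len(order)):
        lo, hi = min(a, e), max(a, e)
        if has_property(order[lo:hi + 1]):
            out.append(e)
    return out


def segmental_decomposition(order, has_property):
    """The distinct sets S_a, listed in the order of their first elements."""
    blocks = []
    for a in range(len(order)):
        block = [order[e] for e in associated_set(order, a, has_property)]
        if block not in blocks:
            blocks.append(block)
    return blocks


def small_gaps(segment):
    """Sample descending property: neighbouring elements differ by at most 1."""
    return all(y - x <= 1 for x, y in zip(segment, segment[1:]))


A = [0, 1, 2, 5, 6, 9]
A1 = segmental_decomposition(A, small_gaps)
assert A1 == [[0, 1, 2], [5, 6], [9]]
```

Here the list `A1` plays the part of the base order: its elements are nonoverlapping segments of `A`, and `A` is their sum. In the paper's setting every finite order is normal, so a finite example cannot show the transfinite stages of the process.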

3. Iterative order type. Let $\alpha$ and $\beta$ be any two given order properties or
types, for example, perfect and scattered. We say that an order is of property
$\alpha\beta$ if it is the sum of $\beta$-orders over an $\alpha$-order. As stated in §1, we term an
order property $\alpha$ iterative if it has "closure" with respect to the operation of
summing over itself; i.e., if the sum of $\alpha$-orders over an $\alpha$-order is again an
$\alpha$-order.
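For finite orders the sum of orders over a base order can be written out explicitly. The sketch below is an illustration under assumptions of its own: orders are Python lists listed in increasing order, and each element of the sum is tagged with the base element it replaces. The name `ordered_sum` is hypothetical.

```python
def ordered_sum(base, summands):
    """Sum of the orders summands[b] over the base order `base`.

    Each element b of the base is replaced by the whole order summands[b];
    the result is listed in increasing order as pairs (b, x).
    """
    result = []
    for b in base:
        result.extend((b, x) for x in summands[b])
    return result


base = ["p", "q"]                       # a two-element base order, p < q
summands = {"p": [1, 2], "q": [1]}      # the order substituted for each element
total = ordered_sum(base, summands)
assert total == [("p", 1), ("p", 2), ("q", 1)]
```

When the base and every summand are listed in increasing order, the result is the lexicographic order on the pairs, which is the sense in which the segments themselves form an order of the type of the base.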

Suppose $\alpha$ is not an iterative order property. We may construct an
iterative property $\beta$ implied by $\alpha$, as follows: By $\alpha^1$ we understand $\alpha$ itself.
Suppose $\alpha^\mu$ is defined for all ordinals $\mu$ less than $\lambda$. An order A will be said to have
property $\alpha^\lambda$ if it is the sum of $\alpha^\mu$-orders over an $\alpha$-order, where $\mu < \lambda$ and $\mu$ is
permissibly variable. $\beta$ is then the sum type of all types(7) $\alpha^\lambda$; i.e., an order A
will be said to have property $\beta$ if it has property $\alpha^\lambda$ for some $\lambda$. We shall
denote the property $\beta$ associated in this way with $\alpha$ by $\alpha^I$, the superscript I
signifying that $\alpha^I$ is iterative, as we show later(8).

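We prove, for future reference, that $\alpha^{\mu+\lambda}$ is a supertype of $\alpha^\lambda\alpha^\mu$. Let us
denote by $\alpha^{\mu*}$ that type which is the sum type of all types $\alpha^\nu$, where $\nu < \mu$. Our
definition of $\alpha^\mu$ may then be written $\alpha^\mu = \alpha\alpha^{\mu*}$. We may then write
$\alpha^{\mu+1} = \alpha\alpha^{(\mu+1)*} = \alpha(\alpha^\mu + \alpha^{\mu*})$. Hence $\alpha^{\mu+1}$ is a supertype of $\alpha\alpha^\mu$ and the statement holds for
$\lambda = 1$. Suppose it holds for all ordinals less than $\lambda$. We may write $\alpha^{\mu+\lambda} = \alpha\alpha^{(\mu+\lambda)*}$.
Our hypothesis implies, however, that $\alpha^{(\mu+\lambda)*}$ is a supertype of $\alpha^{\lambda*}\alpha^\mu$. Therefore
$\alpha^{\mu+\lambda} = \alpha\alpha^{(\mu+\lambda)*}$ is a supertype of $\alpha\alpha^{\lambda*}\alpha^\mu = \alpha^\lambda\alpha^\mu$(9). In particular, we note that
$\alpha^{1+\omega}$ is a supertype of $\alpha^\omega\alpha = (\alpha\alpha^{\omega*})\alpha = \alpha(\alpha^{\omega*}\alpha)$, as would also be
expected from the relation $1 + \omega = \omega$. We show that $\alpha^I$ has certain minimal and
uniqueness properties in relation to $\alpha$, and that these provide a postulational
definition for $\alpha^I$.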

We introduce, for sets in general, a notion of a minimal property. Let $\alpha$
stand for a given set property, and A for a given property of set properties.


(7) Let S be a set of order types $\alpha$. By the sum type of the order types of S we understand
the order type $\beta$ defined as follows. An order will be said to be of type $\beta$ if it has property $\alpha$
for some $\alpha$ of S; otherwise, it will be said not to have type $\beta$.

(8) It may be true that for a given $\alpha$ there always exists a first ordinal $\lambda$ such that $\alpha^\lambda$ is
iterative, but the author has no proof of this.

(°) The exponential law a+ =a** holds if the order consisting of a single element has 
type a. For is then a supertype of a” for all and, consequently, =aa"*i = ac", As- 
suming = aa" for all »<), we have a? = aaa" = aa", 




The set property $\beta$ will be said to be a minimal A-property implied by $\alpha$, if it
is implied by $\alpha$, has property A, and is such that if $\beta'$ is any set property
implied by $\alpha$ and having property A, it is implied by $\beta$. Two minimal
A-properties implied by $\alpha$ are equivalent in the sense that each implies the other. We
may thus regard the minimal A-property implied by $\alpha$ as uniquely
determined, if it exists. We shall therefore speak of "the" instead of "a" minimal
property.


THEOREM 1. $\alpha^I$ is the minimal iterative property implied by $\alpha$.


Proof. Suppose an order A is a sum of $\alpha^I$-orders $A_\lambda$ over an $\alpha^I$-order
$\Lambda = \{\lambda\}$, the subscript $\lambda$ ranging over all the elements of the order $\Lambda$. Each $A_\lambda$
is an $\alpha^\mu$-order for some ordinal $\mu$. We set $\sigma$ equal to the first ordinal larger
than any of the ordinals $\mu$. If $\Lambda$ is an $\alpha^\nu$-order, A is an $\alpha^\nu\alpha^\sigma$-order. Therefore A
is an $\alpha^{\sigma+\nu}$-order and hence an $\alpha^I$-order. $\alpha^I$ is therefore iterative. Now let $\beta$
be an iterative property implied by $\alpha$. Assume $\beta$ is implied by $\alpha^\mu$ for $\mu < \lambda$.
$\beta$ is then implied by $\alpha^\lambda$ since $\beta$ is iterative. Consequently $\beta$ is implied by $\alpha^\lambda$ for
all ordinals $\lambda$, that is, by $\alpha^I$; and the theorem is true.

There is thus uniquely associated with every $\alpha$ the property $\alpha^I$ which is
the minimal iterative property implied by $\alpha$; i.e., the minimal iterative type
which includes $\alpha$ as subtype. We shall say alternatively, that $\alpha^I$ is the minimal
iterative supertype of $\alpha$. The latter phrasing will also be employed, when
convenient, for order type properties other than iterative.

4. Descending order type. We consider now the property descending for
order types. Let us denote by $\alpha^D$ the property of being a segment of an
$\alpha$-order. A segment of an $\alpha^D$-order is a segment of an $\alpha$-order and
consequently an $\alpha^D$-order. Thus $\alpha^D$ is descending. If $\beta$ is a descending property
implied by $\alpha$, every segment of every $\alpha$-order is a $\beta$-order and $\beta$ is implied by
$\alpha^D$. Accordingly, we may state


THEOREM 2. $\alpha^D$ is the minimal descending supertype of $\alpha$.


One may ask the nature of the properties $\alpha^{DI}$ ($= \beta^I$, where $\beta = \alpha^D$), or $\alpha^{ID}$,
etc., composed by combining the above described processes. We find that


THEOREM 3. $\alpha^{DI}$ is the minimal descending and iterative supertype of $\alpha$.


Proof. $\alpha^D = (\alpha^D)^1$ is descending, as we have seen. Suppose $(\alpha^D)^\mu$, for
ordinals $\mu < \lambda$, is descending. If an order A has property $(\alpha^D)^\lambda$, it is the sum of
$(\alpha^D)^\mu$-orders, $\mu < \lambda$, over an $\alpha^D$-order, and every segment S of A is the sum of
segments of $(\alpha^D)^\mu$-orders over a segment of an $\alpha^D$-order. Hence S is the sum of
$(\alpha^D)^\mu$-orders over an $\alpha^D$-order and has property $(\alpha^D)^\lambda$. Therefore $(\alpha^D)^\lambda$ is
descending for all ordinals $\lambda$ and it follows that $\alpha^{DI}$ is descending. By Theorem 1,
$\alpha^{DI}$ is iterative. Suppose now $\beta$ is a descending and iterative property implied
by $\alpha$. Then $\beta$ is implied by $\alpha^D$, and therefore by $\alpha^{DI}$, for $\alpha^D$ and $\alpha^{DI}$ are
minimal. Thus $\alpha^{DI}$ is descending and iterative and is minimal, as was to be proved.




5. Extensive order type. We introduce a third property, again a closure
property, corresponding to the operation described above (§2), of summing
orders over normal orders or reverse normal orders. An order type $\alpha$ will be
termed extensive if the sum of $\alpha$-orders over a normal order or a reverse normal
order is an $\alpha$-order. We prove later that the minimal extensive supertype of $\alpha$
exists for all $\alpha$, and is equivalent to $\sigma\alpha$, where $\sigma$ is the property scattered. In
conformity with the notation previously employed for supertypes of $\alpha$, we
shall denote $\sigma\alpha$ by $\alpha^E$.

6. Transitive order type. The constructions of $\alpha^D$, $\alpha^I$ and $\alpha^E$ are based
on three operations described as follows. $\alpha^D$ has closure with respect to the
operation of taking segments, for a segment of an $\alpha^D$-order is an $\alpha^D$-order.
Accordingly, we term this operation a D-operation. Similarly, $\alpha^I$ has closure
with respect to the I-operation of summing orders of a certain type over
orders of the same type. Also, $\alpha^E$ has closure with respect to the E-operation
of summing over normal or reverse normal orders. We now define an order
type having closure with respect to all three of the above operations. An
order property will be said to be transitive if it is iterative, descending and
extensive. Later it is shown that the minimal transitive supertype of $\alpha$ exists
and is the order type $(\sigma\alpha^D)^I$.

If $\alpha$ is descending, a single element has property $\alpha$. Consequently a
descending and extensive order type includes normal and reverse normal orders
as subtype. Later, we shall see that a transitive order type includes scattered
as subtype.

7. Decomposition of an order into a-orders. We prove now the following 
fundamental decomposition theorem: 


THEOREM 4a. If $\alpha$ is a descending and iterative order property, and A an
order whose normal orders and reverse normal orders have property $\alpha$, then A has
property $\alpha$ or is the sum of $\alpha$-orders over an order no proper segment of which
has property $\alpha$.


Proof. In the special case where A consists of exactly one element, the
theorem is true. Suppose A has more than one element. We shall say that a
segment of A is a maximal $\alpha$-segment if it has property $\alpha$ and no segment
properly containing it has property $\alpha$. Every element a of A is contained in a
maximal $\alpha$-segment. For let $S_a$ be the set of elements e such that the segment
$\langle a, e\rangle$ or $\langle e, a\rangle$ has property $\alpha$, the symbol $\langle\ \rangle$ signifying that the end points
of the segment are included. The set $S_a$ is a segment of A. For if $e'$ is an
element of A between a and an element e of $S_a$, $\langle a, e'\rangle$ or $\langle e', a\rangle$ is a
segment of $\langle a, e\rangle$ or $\langle e, a\rangle$ respectively, and, since $\alpha$ is descending, has
property $\alpha$. Thus $e'$ is an element of $S_a$. The segment $S_a$ has property $\alpha$. For let
$e_1, e_2, \ldots, e_\lambda, \ldots$ be a normal suborder of $S_a$ cofinal with $S_a$. A
segment $(e_\lambda, e_{\lambda+1})$, taken to include $e_{\lambda+1}$ but not $e_\lambda$, is an $\alpha$-order, since
$\langle a, e_{\lambda+1}\rangle$ is an $\alpha$-order. By hypothesis, the normal order $e_1, e_2, \ldots, e_\lambda, \ldots$
is an $\alpha$-order. Thus the suborder of elements of $S_a$ to the right of a is the sum
of $\alpha$-orders over an $\alpha$-order and therefore has property $\alpha$. Similarly, the
suborder of $S_a$ to the left of a is an $\alpha$-order. Thus, the segment $S_a$ is the sum of at
most three $\alpha$-orders and is consequently an $\alpha$-order, since, by hypothesis,
every finite suborder of A is an $\alpha$-order. From our definition of $S_a$, it follows
that no segment properly containing $S_a$ has property $\alpha$, and $S_a$ is hence a
maximal $\alpha$-segment. Moreover, no two distinct maximal $\alpha$-segments have
elements in common, for the sum of two such segments forms an $\alpha$-order
properly containing each of them, contrary to the definitional property of the
maximal $\alpha$-segment. Let B be the order consisting of these maximal
$\alpha$-segments. We have shown that A is the sum of maximal $\alpha$-segments over the
order B. No proper segment of B has property $\alpha$. For suppose there exists
such a segment. Then there exists a subsegment $\langle S_a, S_b\rangle$, $S_a \neq S_b$, of B which
has property $\alpha$. Since $\alpha$ is iterative, the set of elements of A composing the
segment $\langle S_a, S_b\rangle$ has property $\alpha$. It follows that b is an element of $S_a$ and
consequently that $S_a = S_b$, contrary to hypothesis. The theorem is thus proved.


THEOREM 4b. If $\alpha$ is a transitive order property and A a given order, then A
either has property $\alpha$ or is the sum of $\alpha$-orders over an order no proper segment
of which has property $\alpha$.


Proof. We have seen that $\alpha$ includes the order types normal and reverse
normal as subtypes. In particular, all normal and reverse normal orders
contained by A as suborders are $\alpha$-orders, and Theorem 4a applies.


THEOREM 4c. If $\alpha$ is a transitive order property and A a given order, then A
either has property $\alpha$ or is the sum of $\alpha$-orders over a dense order.


Proof. Suppose A is not an $\alpha$-order. Then, by Theorem 4b, A is the sum
of $\alpha$-orders over an order B no proper segment of which has property $\alpha$.
Consequently no proper segment of B consists of a finite number of elements,
since $\alpha$ includes finite order types. We conclude B is a dense order and
the theorem follows.

8. Properties of the type scattered. By means of the above decompositions,
we prove a number of theorems concerning the type scattered. For
convenience, we shall denote this type by the symbol $\sigma$.


THEOREM 5. The order type scattered is transitive. 


Proof. For suppose an order A has property $\sigma$. Then it contains no dense
suborders and surely no segment of A contains a dense suborder.
Consequently, $\sigma$ is a descending order property. $\sigma$ is iterative; for let A be an order
which is the sum of $\sigma$-orders $A_\lambda$ over a $\sigma$-order $\Lambda = \{\lambda\}$. Assume there exists a
dense suborder D of A. No $A_\lambda$ contains more than one element of D since
otherwise $A_\lambda$ would contain a dense order. Thus D is a dense suborder of $\Lambda$,
contrary to hypothesis. A therefore is scattered and consequently $\sigma$ is an
iterative order property. Furthermore, normal and reverse normal orders are
subtypes of scattered orders and we conclude that $\sigma$ is transitive. Substituting
$\sigma$ for $\alpha$ in the decomposition theorem, there results


THEOREM 6. Every order is either scattered or the sum of scattered orders over
a dense order(10).


THEOREM 7. Every transitive order type includes the type scattered as sub- 
type. 


Proof. Let $\alpha$ be a transitive order type and A any scattered order. By
Theorem 4c, A has either property $\alpha$ or is the sum of $\alpha$-orders over a dense
order. In the latter case A would contain a dense suborder, contrary to
hypothesis. Therefore A has property $\alpha$ and the theorem is proved.

9. Minimal supertypes. We now prove the following theorem: 


THEOREM 8. The minimal iterative order type which includes normal and 
reverse normal as subtypes is the type scattered. 


Proof. Let $\alpha$ be the property of being either a normal or reverse normal
order. We have seen (Theorem 3) that $\alpha^{DI}$ is descending and iterative. But
$\alpha^{D} = \alpha$, and consequently $\alpha^{DI} = \alpha^{I}$. Thus the minimal iterative property $\alpha^{I}$
implied by $\alpha$ exists and is descending. $\alpha^{I}$ is surely extensive since it is
iterative and includes normal and reverse normal as subtypes. Therefore $\alpha^{I}$ is
transitive and hence contains $\sigma$ as subtype. But, since $\sigma$ is transitive, it is
iterative, and must therefore include the minimal type $\alpha^{I}$. Thus $\alpha^{I}$ is
equivalent to $\sigma$ and the theorem is valid.


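THEOREM 9. If $\alpha$ is any order type, the minimal extensive supertype of $\alpha$
is $\sigma\alpha$, where $\sigma$ is the order type scattered.


Proof. Let $\beta$ be the property normal or reverse normal. Every $\beta\sigma$-order
is a $\sigma^{2}$-order, hence a $\sigma$-order. Thus $\beta\sigma\alpha = \sigma\alpha$, and $\sigma\alpha$ is extensive. Let now $\gamma$
be any extensive order type including $\alpha$ as subtype. $\gamma$ then includes $\beta\alpha$, hence
includes $\beta\beta\alpha = \beta^{2}\alpha$, $\beta^{3}\alpha$, etc. Suppose $\gamma$ includes $\beta^{\mu}\alpha$ as subtype for $\mu < \lambda$. It
then includes the sum of $\beta^{\mu}\alpha$-orders, $\mu < \lambda$, over a $\beta$-order; that is, $\beta^{\lambda}\alpha$ as
subtype. Thus $\gamma$ includes $\beta^{\lambda}\alpha$ for all ordinals $\lambda$. Therefore $\gamma$ includes $\beta^{I}\alpha$ as
subtype, and since, by Theorem 8, $\beta^{I} = \sigma$, we conclude that $\sigma\alpha$ is minimal as
stated in the theorem.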


THEOREM 10. If $\alpha$ is any order type, the minimal transitive supertype of $\alpha$
is the type $(\sigma\alpha^{D})^{I} = \alpha^{T}$.


Proof. In general, the product of two descending order types is a descending
order type, and we infer that $\sigma\alpha^{D}$ is descending. By Theorem 3, then,
$(\sigma\alpha^{D})^{I}$ is descending and iterative. Clearly, $(\sigma\alpha^{D})^{I}$ includes $\sigma$, and hence
includes normal and reverse normal as subtype. Hence, since $(\sigma\alpha^{D})^{I}$ is iterative,
it is extensive. Consequently, $(\sigma\alpha^{D})^{I}$ is transitive. Now, let $\beta$ be any transitive
type which includes $\alpha$ as subtype. $\beta$ is descending and includes therefore the
minimal property $\alpha^{D}$. Likewise, $\beta$ is extensive and includes therefore $\sigma\alpha^{D}$.
Again, since $\beta$ is iterative, it includes $(\sigma\alpha^{D})^{I}$. Thus $(\sigma\alpha^{D})^{I}$ is the minimal
transitive order type called for in the theorem.


(10) Cf. Hausdorff, loc. cit.

We denote, for brevity, the transitive property $(\sigma\alpha^{D})^{I}$ associated with $\alpha$,
by $\alpha^{T}$. Combining Theorems 4b and 10, we may state


THEOREM 11. If A is any order, and $\alpha$ any given order property, either A
has property $\alpha^{T}$ or is the sum of $\alpha^{T}$-orders over an order no proper segment of
which has property $\alpha^{T}$, where $\alpha^{T}$ is the minimal transitive property implied by $\alpha$.


We now determine minimal properties for a number of particular order 
properties. Instead of speaking of an order property we will find it conven- 
ient, on occasion, to speak of the set of orders having the property. We intro- 
duce, for this purpose, the following terminology. Let S be a given set of 
orders, and o the property of belonging to S. We understand by the minimal
iterative set containing S the set M of orders such that the property of
belonging to M is equivalent to the minimal iterative property implied by o. An
analogous phrasing will be used for order type properties other than iterative. 


THEOREM 12. The minimal iterative and descending set containing the set
which consists of the single element $\omega_\lambda$ ($\omega_\lambda$-reversed) is the set of all normal orders
(reverse normal orders) of cardinal $\aleph_\lambda$ or less.


Proof. Clearly, the set of all normal orders of cardinal $\aleph_\lambda$ or less is
iterative and descending. Suppose all proper initial segments of $\nu$, where $\nu$ is a
normal order of cardinal $\aleph_\lambda$, belong to M, the minimal iterative and
descending set containing $\omega_\lambda$. If $\nu$ has a last element, $\nu$ itself is an element of M since
the normal orders $\nu - 1$, 1 and 2 are in M. If $\nu$ has no last element, there exists
a normal suborder of ordinals $\nu_1, \nu_2, \ldots, \nu_\mu, \ldots$ of $\nu$, cofinal with $\nu$ and of
type $\omega_\lambda$ or less. Thus $\nu$ is an initial segment of the normal order whose ordinal
is $\nu_1 + \nu_2 + \cdots + \nu_\mu + \cdots$, with $\nu_\mu$ in M. Consequently $\nu$ is an
element of M. The argument for reverse normal orders is similar.


THEOREM 13. The minimal iterative and descending set containing the set
consisting of the two elements $\omega_\lambda$, $\omega_\lambda$-reversed, is the set of all scattered orders of
cardinal $\aleph_\lambda$ or less.


Proof. Clearly, the set of scattered orders of cardinal $\aleph_\lambda$ or less is
iterative and descending. Now, let $\alpha$ be any iterative and descending order type
which includes the type $\omega_\lambda$ and $\omega_\lambda$-reversed, and let A be any scattered order
of cardinal $\aleph_\lambda$ or less. By Theorem 12, all normal orders and reverse normal
orders of A have property $\alpha$. But by Theorem 4a, A is either an $\alpha$-order or is
the sum of $\alpha$-orders over a dense order. In the latter case A would contain a
dense order, contrary to hypothesis. Thus the scattered orders of cardinal $\aleph_\lambda$
or less have property $\alpha$ and constitute the minimal set described in the theorem.

We choose, thirdly, for a particular set of orders, the set S of orders of
cardinal less than $\aleph_\lambda$. We have seen that by means of the operations of taking
segments and summing over certain orders, we may construct the orders of
the set M which is the minimal transitive set containing S as subset. We
inquire now as to the nature of an order of M. The decomposition theorem
shows us that an order A is either an order of M or is the sum of orders
belonging to M over an order B no proper segment of which belongs to M.
Suppose the latter is true. Then no proper segment of B is of cardinal less
than $\aleph_\lambda$, and consequently every proper segment of B is of cardinal $\aleph_\lambda$ or
more. This property of B suggests the notion $\aleph_\lambda$-dense which we define as
follows: An order will be said to be $\aleph_\lambda$-dense if it has more than one element
and every proper segment of it contains $\aleph_\lambda$ or more elements. Thus $\aleph_\lambda$-dense
is a generalization of the property dense, $\aleph_0$-dense being equivalent to the
property dense.

Let us return to the consideration of the properties of an order of M. We
have found that A is either in M or contains an $\aleph_\lambda$-dense suborder. For the
purpose of insuring that A be in M we need merely specify that no suborder
of A be $\aleph_\lambda$-dense. A will then be said to be $\aleph_\lambda$-scattered, and in general an
order possessing no $\aleph_\lambda$-dense suborder will be termed $\aleph_\lambda$-scattered. Thus
$\aleph_\lambda$-scattered is a generalization of the property scattered, the latter being
equivalent to $\aleph_0$-scattered. Making the guess that, conversely, all orders of M
are $\aleph_\lambda$-scattered we venture


THEOREM 14. The minimal transitive set which contains the set of all orders
of cardinal less than $\aleph_\lambda$ is the set of $\aleph_\lambda$-scattered orders.


Proof. We have seen that the set A of $\aleph_\lambda$-scattered orders is a subset
of M, the minimal transitive set containing all orders of cardinal less than $\aleph_\lambda$.
We show conversely, that M is a subset of A. By Theorem 5, the set of
$\aleph_0$-scattered orders is a transitive set. If we substitute $\aleph_\lambda$-scattered for
scattered and $\aleph_\lambda$-dense for dense in the proof of this theorem we secure a proof
that the property $\aleph_\lambda$-scattered is transitive for all $\lambda$. Also, A contains all
orders of cardinal less than $\aleph_\lambda$. But M, being a minimal transitive set, is then
a subset of the transitive set A. Thus M and A are identical and $\aleph_\lambda$-scattered
is the required minimal property.

10. Transfinite integer. We now define an order I which is a generalization
of the order of the positive and negative integers, and which is a
universal scattered order in the sense that it contains all scattered orders as
suborders. By transfinite integer we understand the form:

$$\sum_{i=1}^{n} a_i\omega^{\alpha_i} = a_1\omega^{\alpha_1} + a_2\omega^{\alpha_2} + \cdots + a_n\omega^{\alpha_n},$$

where the coefficients $a_i$ are ordinary integers, and the exponents $\alpha_i$ are
ordinals decreasing as i increases. We denote the totality of transfinite integers
by I(11). An element $\sum_{i=1}^{n} a_i\omega^{\alpha_i}$ of I will be said to be less than an element
$\sum_{i=1}^{m} b_i\omega^{\beta_i}$ of I, if for the first index at which the two forms disagree either
$\alpha_i < \beta_i$ or $\alpha_i = \beta_i$ and $a_i < b_i$. It is seen that I thus becomes a linear order. We
define sum and product for two transfinite integers in the customary algebraic
manner.
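
As an illustration, the ordering of transfinite integers can be sketched in Python for the special case of finite exponents. The class name `TransfiniteInteger`, the dictionary representation, and the comparison by the sign of the leading coefficient of the difference are assumptions made for this sketch, not constructions taken from the text; the comparison agrees with the rule stated above whenever the deciding coefficients are positive.

```python
from functools import total_ordering


@total_ordering
class TransfiniteInteger:
    """Sketch of a form sum(a_i * w**e_i) with integer a_i and finite exponents e_i >= 0.

    Hypothetical helper for illustration; general ordinal exponents are not modeled.
    """

    def __init__(self, terms):
        # terms maps exponent -> integer coefficient; zero coefficients are dropped.
        self.terms = {e: c for e, c in dict(terms).items() if c != 0}

    def __add__(self, other):
        out = dict(self.terms)
        for e, c in other.terms.items():
            out[e] = out.get(e, 0) + c
        return TransfiniteInteger(out)

    def __neg__(self):
        return TransfiniteInteger({e: -c for e, c in self.terms.items()})

    def __sub__(self, other):
        return self + (-other)

    def sign(self):
        # Sign of the coefficient of the highest power of w (0 for the zero form).
        if not self.terms:
            return 0
        lead = self.terms[max(self.terms)]
        return 1 if lead > 0 else -1

    def __eq__(self, other):
        return self.terms == other.terms

    def __lt__(self, other):
        # Assumption: x < y exactly when y - x has a positive leading coefficient.
        return (other - self).sign() > 0

    def __repr__(self):
        return f"TransfiniteInteger({self.terms})"


w = TransfiniteInteger({1: 1})  # the form w


def n(k):
    """An ordinary integer k viewed as a transfinite integer."""
    return TransfiniteInteger({0: k})


sample = [w + n(1), n(3), w - n(2), w, n(1), w + w - n(3), w - n(1)]
print(sorted(sample))
```

Sorting the sample arranges the forms as 1, 3, w-2, w-1, w, w+1, 2w-3, so the ordinary integers precede every form whose leading term is a positive multiple of w.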


THEOREM 15a. I is scattered and every scattered order is similar to a suborder 
of I. 


Proof. Let $\alpha$ be the property of being a suborder of I which is not
coinitial nor cofinal with I. Clearly, $\alpha$ is descending. It is also iterative. For
suppose A is the sum of $\alpha$-orders $A_\lambda$ over an $\alpha$-order $\Lambda = \{\lambda\}$. A is then
isomorphic with the linear order which consists of the pairs $(\lambda, a_\lambda)$, where $\lambda$ is
any element of $\Lambda$ and $a_\lambda$ any element of $A_\lambda$, ordered first according to $\lambda$ and
then according to $a_\lambda$. Let $\omega^{\mu}$ be the first power of $\omega$ such that it is greater
than, and $-\omega^{\mu}$ less than, every transfinite integer occurring in the orders $A_\lambda$.
The transfinite integers of the form

$$\omega^{\mu}\sum_{i=1}^{m} d_i\omega^{\delta_i} + \sum_{j=1}^{n} a_j\omega^{\alpha_j},$$

where $\sum_{i=1}^{m} d_i\omega^{\delta_i}$ is an element $\lambda$ of $\Lambda$, and $\sum_{j=1}^{n} a_j\omega^{\alpha_j}$ is an element $a_\lambda$ of $A_\lambda$,
constitute a suborder of I similar to A, for the correspondence

$$(\lambda, a_\lambda) \longleftrightarrow \omega^{\mu}\sum_{i=1}^{m} d_i\omega^{\delta_i} + \sum_{j=1}^{n} a_j\omega^{\alpha_j}$$

is biunique and preserves order. $\alpha$ is thus descending and iterative. Suppose
now A is a scattered order. It is either an $\alpha$-order or the sum of $\alpha$-orders over
a dense order, by Theorem 4c. The latter case cannot occur, for A would then
contain a dense order, contrary to hypothesis. A is therefore a suborder of I.
We now show, conversely, that every suborder of I is scattered. Upon
“writing out” any segment of the order of the transfinite integers, as for
example: $1, 2, 3, \ldots, \omega-2, \omega-1, \omega, \omega+1, \omega+2, \omega+3, \ldots;$
$\ldots, 2\omega-3, 2\omega-2, 2\omega-1, 2\omega, 2\omega+1, 2\omega+2, 2\omega+3,$
$\ldots, \omega^{2}+n\omega+m, \ldots$, we see that it is locally symmetric in the sense that
for every Dedekind cut (A, B) of I, A and B both non-null, there exist
subsegments $A_1$ of A and $B_1$ of B, cofinal and coinitial respectively with A and B,
such that $A_1$ is similar to $B_1$ reversed(12). Suppose I is not a scattered order.
By Theorem 6, it is the sum of scattered orders $A_\lambda$ over a dense order $\Lambda = \{\lambda\}$.
No final segment of $A_\mu$, where $\mu$ is a fixed element of $\Lambda$, can be similar to the
reverse of any initial segment of the set of elements to the right of $A_\mu$. For
every such segment contains a dense order and I would then not have local
symmetry, contrary to fact. We conclude I is scattered.


(11) We introduce I, rather than a segment of I, for convenience, not insisting upon the
logical character of I as a totality.

(12) Moreover, it is clear that I, or segments of I, are the only orders, except for
isomorphism, with this local symmetry.
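
The description of a sum of orders as pairs ordered first by the index and then within the component, used in the proof of Theorem 15a, can be illustrated for finite orders. This is a minimal sketch: the function names `ordered_sum` and `precedes` and the list-based representation are assumptions introduced here, and finite examples cannot, of course, exhibit dense orders.

```python
def ordered_sum(index_order, components):
    """Return the sum of the orders components[lam] over index_order.

    index_order: list of labels in increasing order.
    components: dict mapping each label to a list of elements in increasing order.
    The result lists the pairs (lam, a) in the order of the sum.
    """
    return [(lam, a) for lam in index_order for a in components[lam]]


def precedes(p, q, index_order, components):
    """Compare two pairs: first by label, then within the common component."""
    (lam, a), (mu, b) = p, q
    if lam != mu:
        return index_order.index(lam) < index_order.index(mu)
    return components[lam].index(a) < components[lam].index(b)


# Hypothetical finite example: three component orders summed over the order x < y < z.
index_order = ["x", "y", "z"]
components = {"x": [3, 1], "y": ["only"], "z": [0, 5, 2]}
total = ordered_sum(index_order, components)
print(total)
```

Each component keeps its own internal order (here 3 precedes 1 in the component labelled `x`), and `precedes` agrees with the position of the pairs in `total`.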

We shall say that a transfinite integer a is of cardinal $\aleph_\mu$ if there are $\aleph_\mu$
elements in the segment (0, a). The segment of I comprised of all elements of
cardinal less than $\aleph_\mu$ we shall indicate by $I_\mu$.


THEOREM 15b. $I_\mu$ is scattered and contains as suborder all scattered orders of
cardinal less than $\aleph_\mu$.


Proof. Of course $I_\mu$, being a suborder of I, is scattered. Let us assume,
first, $\aleph_\mu$ is a regular cardinal. Substituting $I_\mu$ for I in the proof of Theorem 15a
there results a proof that the property $\alpha$ of being a suborder of $I_\mu$ not coinitial
nor cofinal with $I_\mu$ is descending and iterative. If A is a scattered order of
cardinal less than $\aleph_\mu$, all normal and reverse normal suborders of A are of
cardinal less than $\aleph_\mu$ and hence have property $\alpha$. As above, it follows that A is
a suborder of $I_\mu$. The theorem is thus proved for regular cardinals $\aleph_\mu$. Suppose,
now, $\aleph_\mu$ is not regular. Then $\aleph_\mu$ has no cardinal which is an immediate
predecessor and is therefore expressible as a sum of regular cardinals $\aleph_\nu$, with
$\nu < \mu$. Consequently, if A is a scattered order of cardinal less than $\aleph_\mu$, we
reason A has cardinal less than some regular cardinal $\aleph_\nu$, with $\nu < \mu$. Therefore A
is a suborder of $I_\nu$. Inasmuch as $I_\nu$ is a segment of $I_\mu$, A is a suborder of $I_\mu$.
The theorem is thus proved for all cardinals $\aleph_\mu$.

There are $2^{\aleph_0}$ scattered orders of cardinal $\aleph_0$. Accordingly, one may form
an order S which is scattered and contains as suborders all scattered orders
of cardinal less than $\aleph_1$, for example, simply by summing all such orders over
a normal order of cardinal $2^{\aleph_0}$. Let us compare the two orders S and $I_1$, both
of which are scattered and contain all scattered orders of cardinal less than
$\aleph_1$. $I_1$ is certainly of cardinal $\aleph_1$ whereas S is of cardinal $2^{\aleph_0}$. Hence, they are
both of cardinal $\aleph_1$ only if the continuum hypothesis $\aleph_1 = 2^{\aleph_0}$ is true.

Let us now compare $I_\mu$ to the order S composed by summing all scattered
orders of cardinal less than $\aleph_\mu$ over a normal order. This normal order must
be of cardinal $\sum_{\nu<\mu} 2^{\aleph_\nu}$. Thus S is of cardinal $\sum_{\nu<\mu} 2^{\aleph_\nu}$. If $\aleph_{\nu+1} = 2^{\aleph_\nu}$, then
$\sum_{\nu<\mu} 2^{\aleph_\nu} = \aleph_\mu$ and S is of cardinal $\aleph_\mu$. As we have seen, $I_\mu$ is of cardinal $\aleph_\mu$. We may
say, therefore, that our knowledge of whether the cardinal of S is as small as
that of $I_\mu$ depends upon the validity of the generalized continuum hypothesis.

11. The order types $\alpha$-dense and $\alpha$-scattered. Starting with the order
property “has cardinal less than $\aleph_\lambda$,” we were led to the order properties
$\aleph_\lambda$-dense and $\aleph_\lambda$-scattered. In a similar fashion, starting instead with an
arbitrary order property, we are led to the properties we now describe. An order
every proper segment of which has more than two elements and contains a
suborder of property $\alpha$ will be termed $\alpha$-dense. An order containing no $\alpha$-dense
suborders will be termed $\alpha$-scattered. If $\alpha$ is the property of containing $\aleph_\lambda$
or more elements, $\alpha$-dense and $\alpha$-scattered are equivalent to the previously
defined properties $\aleph_\lambda$-dense and $\aleph_\lambda$-scattered, respectively. Substituting
$\alpha$-dense for dense and $\alpha$-scattered for scattered in the proof of Theorem 5,
we may prove


THEOREM 16. If $\alpha$ is any order property, $\alpha$-scattered is transitive.


We note the following alternative wording of the definition of $\alpha$-scattered.
An order A is $\alpha$-scattered if no suborder of it is $\alpha$-dense; that is, every suborder
has a proper segment which either has exactly two elements or is
such that every suborder has property $\alpha^{N}$, where $\alpha^{N}$ denotes the order
property: “not of type $\alpha$.” In particular, an $\aleph_0$-scattered order, that is, a scattered
order, may be defined as an order which has the property that every suborder
of it contains a segment consisting of exactly two elements. By comparing
these definitions of scattered and $\alpha$-scattered, it becomes apparent that the type
$\alpha$-scattered includes scattered orders as subtype. This would, of course, also
be inferred from Theorems 7 and 16.

Since $\alpha$-scattered is transitive, it furnishes a second decomposition of
every order as follows. Every order is either $\alpha$-scattered or the sum of
$\alpha$-scattered orders over an order B no proper segment of which has property
$\alpha$-scattered. That is to say, every proper segment of B, if it exists, contains an
$\alpha$-dense order. We conclude that


THEOREM 17. If $\alpha$ is any order property and A an order, A is either
$\alpha$-scattered or the sum of $\alpha$-scattered orders over an $\alpha$-dense order.


$\alpha$-scattered, being transitive, is equivalent to some property $\beta^{T}$ when $\beta$
is properly chosen. We may ask for an “economical” way to describe $\beta$ in
terms of $\alpha$. We have, of course, an “extravagant” solution if we set $\beta$ equal to
$\alpha$-scattered. A more “thrifty” answer is developed as follows. We have seen
that every suborder of an $\alpha$-scattered order has a proper segment which either
consists of exactly two elements or is such that every suborder has property
$\alpha^{N}$. Thus $\alpha$-scattered orders contain scattered orders but no $\alpha$-orders. We try


THEOREM 18a. If $\alpha$ is any order property, the order property $\alpha$-scattered is
equivalent to the order property $\bar{\alpha}^{T}$, where $\bar{\alpha}$ is the property of containing no
suborders of property $\alpha$.


Proof. An $\bar{\alpha}$-order is $\alpha$-scattered since it contains no $\alpha$-orders and hence,
surely, no $\alpha$-dense order. Therefore, $\alpha$-scattered is a transitive order type
which includes $\bar{\alpha}$-orders as subtype. Since $\bar{\alpha}^{T}$ is the minimal transitive order
type which includes $\bar{\alpha}$ as subtype, $\alpha$-scattered includes $\bar{\alpha}^{T}$ as subtype.
Suppose, now, A is an $\alpha$-scattered order. By Theorem 11, either A has property
$\bar{\alpha}^{T}$ or is the sum of $\bar{\alpha}^{T}$-orders over an order B no proper segment of which has
property $\bar{\alpha}^{T}$. In the latter case B has, in particular, no proper segment with
property $\bar{\alpha}$. Consequently, if B exists, every proper segment of B contains a
suborder with property $\alpha$; that is, B contains an $\alpha$-dense suborder. Thus A would
contain an $\alpha$-dense suborder, contrary to hypothesis. We conclude that A has
property $\bar{\alpha}^{T}$. The latter property is therefore equivalent to the property
$\alpha$-scattered.

We determine a construction for $\alpha$-scattered in terms of operations on
$\alpha$-orders. Let us consider $\bar{\alpha}$, the order property “containing no $\alpha$-order as
suborder.” This is equivalent to the property “every suborder has property
$\alpha^{N}$.” Now, if we denote by $\alpha^{R}$ the property of being a superorder of an $\alpha$-order,
the property $\bar{\alpha}$ is equivalent to the property $\alpha^{RN}$. Accordingly, Theorem 18a
may be written:


THEOREM 18b. $\alpha$-scattered $= \alpha^{RNT} = \alpha^{RNDEI}$.


Just as $\alpha^{DD} = \alpha^{D}$, so $\alpha^{RR} = \alpha^{R}$. We shall term $\alpha^{R}$ rising, and, in general, if
for an order property $\alpha$, $\alpha^{R}$ is equivalent to $\alpha$, that is, if every superorder of
an $\alpha$-order is an $\alpha$-order, we shall term $\alpha$ rising. It is clear $\alpha^{R}$ is the minimal
rising property implied by $\alpha$. On the other hand, the type $\alpha^{RN}$ has the
property that a suborder of an $\alpha^{RN}$-order is an $\alpha^{RN}$-order, and we term $\alpha^{RN}$ falling.
More generally, we shall name an order property $\alpha$ falling if every suborder
of an $\alpha$-order is an $\alpha$-order. If $\alpha$ is any order type we shall denote by $\alpha^{F}$ the
property of being a suborder of an $\alpha$-order. Manifestly $\alpha^{F}$ is the minimal
falling supertype of $\alpha$. Again we have a “closure equation” $\alpha^{FF} = \alpha^{F}$. We note
also, for future reference, that if $\alpha$ is a falling property, $\alpha^{N}$ is a rising property
and conversely. Thus $\alpha^{FN}$ is rising and $\alpha^{RN}$ is falling, for all $\alpha$.

In the equation $\alpha$-scattered $= \alpha^{RNDEI}$, we may regard the RNDEI-operation
as a “solution for X” in the conditional equation $\alpha$-scattered $= \alpha^{X}$.
Conversely, we inquire as to possible “solutions” of the equation
$\alpha^{X}$-scattered $= \alpha$, where by X we have in mind an operation corresponding to some
combination of the letters F, R, I, etc. Let us assume there exists a solution.
$\alpha^{X}$-scattered is a transitive order type. Clearly, it is also falling. Therefore
$\alpha^{F} = \alpha$, $\alpha^{T} = \alpha$, and $\alpha^{FT} = \alpha$. Combining, we obtain $\alpha^{X}$-scattered $= \alpha^{FT}$.
But, by Theorem 18b, we may substitute $\alpha^{XRNDEI} = \alpha^{XRNT}$ for $\alpha^{X}$-scattered.
Our conditional equation becomes $\alpha^{XRNT} = \alpha^{FT}$. This is true if $\alpha^{XRN} = \alpha^{F}$, or
$\alpha^{XR} = \alpha^{FN}$. The latter equation is equivalent to $\alpha^{XR} = \alpha^{FNR}$, since $\alpha^{FN}$ is rising.
Finally, $\alpha^{XR} = \alpha^{FNR}$ is implied by $\alpha^{X} = \alpha^{FN}$. We state


THEOREM 19. If $\alpha$ is any order type, the minimal falling and transitive
supertype of $\alpha$ is $\alpha^{FT} = \alpha^{FN}$-scattered.


Proof. We prove the equivalence of $\alpha^{FT}$ and $\alpha^{FN}$-scattered directly. For,
since $\alpha^{FN}$ is rising, $\alpha^{FN}$-scattered $= \alpha^{FNRNT} = \alpha^{FNNT} = \alpha^{FT}$. $\alpha^{FT}$ is
minimal for, by Theorem 10, $\alpha^{FT} = (\sigma\alpha^{FD})^{I} = (\sigma\alpha^{F})^{I}$ where $\sigma$ is the order type
scattered. The product of two falling order types is, in every case, falling.
Thus $\sigma\alpha^{F}$ is falling. In Theorem 3 we proved that $\alpha^{I}$ is descending if $\alpha$ is
descending. In a similar fashion we may show $\alpha^{I}$ is falling if $\alpha$ is falling. We
conclude $(\sigma\alpha^{F})^{I} = \alpha^{FT}$ is falling. Furthermore, $\alpha^{FT}$ is transitive since the
T-operation is performed last. Every falling and transitive order type $\beta$
which includes $\alpha$ includes $\alpha^{FT}$. For $\beta$ includes $\alpha$ implies $\beta$ includes the
minimal $\alpha^{F}$. Since $\beta$ is transitive, includes $\alpha^{F}$, and $\alpha^{FT}$ is the minimal transitive
type which includes $\alpha^{F}$, $\beta$ includes $\alpha^{FT}$. Thus $\alpha^{FT} = \alpha^{FN}$-scattered satisfies the
requisite minimal condition of the theorem.

We are now in a position to “solve” for X in the equation $\alpha^{X}$-scattered $= \alpha$.
There is a solution if and only if $\alpha^{FT} = \alpha$. Then $\alpha^{FN}$-scattered $= \alpha^{FT} = \alpha$;
or $\alpha^{N}$-scattered $= \alpha$. Thus X = N is a solution.


We next establish the following characterization for the type
$\alpha$-scattered.


THEOREM 20. If $\alpha$ is any order type, the order type $\alpha$-scattered is the minimal
falling and transitive supertype of $\alpha^{RN}$.


Proof. The minimal falling and transitive type which includes $\alpha^{RN}$ is, by
Theorem 19, $\alpha^{RNFT}$. Since $\alpha^{RN}$ is falling, $\alpha^{RNF} = \alpha^{RN}$ and $\alpha^{RNFT} = \alpha^{RNT}$. But
$\alpha^{RNT} = \alpha$-scattered. Thus $\alpha^{RNFT} = \alpha$-scattered, proving the theorem.

The above results show the class of order properties $\alpha$-scattered is
identical with the class of order properties which are both transitive and falling.
Moreover, a transitive order property is not, in every case, a falling order
property. For example, $\alpha^{T}$, where $\alpha$ is the order type of the continuum, does
not include the order type of the rational numbers as subtype.

In regard to the notion $\alpha$-dense, it turns out the equation $\alpha^{X} = (\alpha\text{-dense})^{R}$
has a “solution”(13). For an order is $\alpha$-scattered if no $\alpha$-dense suborders exist.
An equivalent statement is that $\alpha$-scattered $= (\alpha\text{-dense})^{RN}$. Thus
$(\alpha\text{-scattered})^{N} = \alpha^{RNTN} = (\alpha\text{-dense})^{R}$.

12. Properties of orders with $\aleph_\lambda$ elements. The decomposition of
Theorem 4a shows that


THEOREM 21. Every order of cardinal $\aleph_\lambda$ containing neither $\omega_\lambda$ nor
$\omega_\lambda$-reversed, with $\aleph_\lambda$ regular, is the sum of orders each of cardinal less than $\aleph_\lambda$ over
an $\aleph_\lambda$-dense order.


Proof. The property $\alpha$ of containing less than $\aleph_\lambda$ elements is iterative and
descending. With this choice of $\alpha$, Theorem 4a becomes the above theorem.

Thus every order of cardinal $\aleph_\lambda$, containing neither $\omega_\lambda$ nor $\omega_\lambda$-reversed,
with $\aleph_\lambda$ regular, contains an $\aleph_\lambda$-dense order. It follows that


(13) It seems that in the equation $\alpha^{X} = \alpha$-dense there is no solution for X in terms of the
letters D, E, I, F, R, N. We may, however, construct an $\alpha$-dense order, once $\alpha$ is given, as
follows: Let $A_1, A_2, \ldots, A_n, \ldots$ be a series of $\alpha$-orders. We may form a development
$a_1 a_2 \cdots a_n \cdots$, where the entry $a_n$ is an element of $A_n$. The set of all such developments,
ordered lexicographically, constitutes an $\alpha$-dense order. The set of all such orders determines
an order type which might also appropriately be termed an $\omega$th power of $\alpha$. Cf. §3.




THEOREM 22. Every order of regular cardinal $\aleph_\lambda$ contains either $\omega_\lambda$, or
$\omega_\lambda$-reversed, or an $\aleph_\lambda$-dense order.


Order types $\alpha$ which may serve as a starting point in the formation of new
order types and in the study of the structure of orders are: dense, closed,
perfect, of cardinal $\aleph_\lambda$, $\omega_\lambda$, $\eta_\lambda$(14), etc. We may then form the order types $\alpha^{D}$, $\alpha^{E}$,
$\alpha^{I}$, $\alpha^{R}$, $\alpha^{F}$, $\alpha^{N}$ and any combination of these to form new order types $\beta$. With
each of these $\beta$'s are associated the transitive types $\beta^{T}$ and $\beta$-scattered, each of
which provides a segmental decomposition of every order. A number of these
associated types and decompositions have been considered in this paper.
Others, of possible interest, we leave to future investigation. We note, too,
the possibility of introducing order types associated with $\alpha$ by other means
such as classification according to properties of Dedekind cuts, properties
of initial segments, etc.(15).


(14) See Hausdorff, Mengenlehre, pp. 180-185.
(15) See Hausdorff, Mengenlehre, pp. 142-147.


ST. MICHAEL'S COLLEGE,
WINOOSKI PARK, VT.


EXPANSIONS OF ANALYTIC FUNCTIONS 


BY 
R. P. BOAS, JR. 


Introduction. There is an extensive literature dealing with the problem of
expanding analytic functions of a complex variable in generalized Taylor
series of the form


(1) $$f(z) = \sum_{n=0}^{\infty} c_n g_n(z),$$


where the $g_n(z)$ are, in a suitable sense, “nearly” the functions $z^{n}$(1). If
$g_n(z) = z^{n}[1+h_n(z)]$, where the $h_n(z)$ are analytic and bounded in a circle
$|z| < r$ and vanish at $z = 0$, and $f(z)$ is analytic in $|z| < r$, the possibility of
an expansion of the form (1) was established by S. Pincherle [9]; the series
converges to $f(z)$ in some circle $|z| < s$, where in general $s < r$. Much of the
later work has been devoted to obtaining better estimates for the number s. 
In this paper, a new attack on the problem is developed; it eliminates re- 
arrangements of power series, and uses a criterion for “nearness” of two se- 
quences of functions which is essentially contained in work of Paley and 
Wiener [26, p. 100] (where it is applied to another problem). The results in- 
clude some of those of G. S. Ketchum [4], which are the most precise yet 
obtained, and in part go beyond them. Well known expansion theorems of 
G. D. Birkhoff [1] and J. L. Walsh [17] are also obtained. 

The simplest of my results (and the most convenient one for applications)
is that if the functions $g_n(z)$ in (1) are of the form specified above, and if the
$h_n(z)$ have a common majorant $h(z)$ for large n (that is, if the coefficients in
the power series of $h_n(z)$ are less in absolute value than the corresponding
coefficients of $h(z)$), then the expansion (1) converges to $f(z)$ in $|z| < s$ if $h(s) < 1$.
For example, if $1+h_n(z) = e^{a_n z}$, with $\limsup_{n\to\infty} |a_n| \le 1$, we may take
$h(z) = e^{(1+\epsilon)z} - 1$ (with any positive $\epsilon$), so that the region of convergence of
(1) is at least $|z| < \log 2$; I have not been able to establish convergence in
a larger region than $|z| < 1/e$ by using the theorems in the literature(2).
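
The bound $\log 2$ in this example follows from a short computation, sketched here under the reading that $1+h_n(z) = e^{a_n z}$ with $\limsup |a_n| \le 1$; the index $k$ for the Taylor coefficients is introduced only for this illustration.

```latex
% For large n we have |a_n| < 1 + \epsilon, so each Taylor coefficient of
% h_n(z) = e^{a_n z} - 1 is dominated by that of h(z) = e^{(1+\epsilon) z} - 1.
\[
  \left|\frac{a_n^{k}}{k!}\right| < \frac{(1+\epsilon)^{k}}{k!}
  \quad (k \ge 1),
  \qquad
  h(s) = e^{(1+\epsilon)s} - 1 < 1
  \iff s < \frac{\log 2}{1+\epsilon}.
\]
% Letting \epsilon tend to 0 gives convergence in every circle |z| < s with s < \log 2.
```

Since $\epsilon$ is arbitrary, every $s < \log 2$ is admissible, which gives the stated region $|z| < \log 2$.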

It is also possible to restrict linear combinations of the coefficients of the
$h_n(z)$ instead of the coefficients themselves; this can be done by a method
different from that used by G. S. Ketchum in obtaining the first such results
(see §5). Another generalization consists in modifying the assumption that
the functions $g_n(z)$ should have precisely the form $z^{n}[1+h_n(z)]$ (see Theorem
6.4).


Presented to the Society, April 27, 1940; received by the editors March 18, 1940.

(1) The bibliography at the end of this paper contains all the references which I have
found (without however making an intensive search of the literature) on general expansions of
this type. For special theorems, other than those considered in this paper, see especially G. S.
Ketchum [4]. (Numbers in brackets refer to the bibliography.)

(2) Added in proof: Ibragimoff [32] has proved that every function analytic in $|z| < s$ is
the uniform limit in $|z| \le s' < s$ of a sequence of linear combinations of the functions in question
if $s \le \log 2$.

The expansion theorems of this paper were originally developed in the 
hope (which has so far proved illusory) of settling a conjecture concerning 
the values taken by derivatives of entire functions. However, I have obtained 
some new results in this field. In particular, I prove the following theorem 
(Theorem 7.1): If $f(z)$ is an entire function of exponential type $k < \log 2$, with
$f(0) = 1$, and if the points $a_n$ ($n = 0, 1, 2, \ldots$) are in the circle $|z| < 1$, then
for every $r < k$


n=O 


This generalizes a theorem of S. Takenaka(3) which states that $f^{(n)}(a_n)$ cannot
be zero for all n.

Many of the papers listed in the bibliography treat, besides the conver- 
gence of the series (1), the existence of systems of functions biorthogonal to 
the g,(z), the form of the coefficients in (1), etc. These problems are not con- 
sidered in this paper, although its methods could be made to furnish
information about them.

Some of the results of this paper were announced, with indications of the
proofs, in a note in the Proceedings of the National Academy of Sciences(4).

1. Abstract expansion theorems. We consider a normed complex linear
space E, and a sequence $G = \{x_n\}$ of elements of E. G is said to be a
fundamental set if the set of all finite linear combinations of elements of G is
everywhere dense in E; that is, if for every $y \in E$ there exist complex numbers
$c_{k,n}$ such that


(1.1) $$y = \lim_{n\to\infty} \sum_{k=1}^{n} c_{k,n} x_k.$$


G is said to be a base if every element $y \in E$ has a unique representation as
an infinite series of multiples of elements of G; that is, if for every $y \in E$
there exists a unique sequence of complex numbers $c_k$ such that


(1.2) $$y = \lim_{n\to\infty} \sum_{k=1}^{n} c_k x_k.$$


The following theorem states in effect that a sequence sufficiently near 


(3) See J. M. Whittaker [30, p. 44]; Takenaka [29].
(4) Vol. 26 (1940), pp. 139-143.


1940] EXPANSIONS OF ANALYTIC FUNCTIONS 469 


another sequence which is a fundamental sequence or a base is also a funda- 
mental sequence or a base. 


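THEOREM 1.1. Let the sequences $G = \{x_n\}$ and $H = \{y_n\}$ have the property
that for some number $\lambda$ ($0 < \lambda < 1$), and for all finite sequences $a_1, \dots, a_N$
of complex numbers,

(1.3) $\left\| \sum_{n=1}^{N} a_n (x_n - y_n) \right\| \le \lambda \left\| \sum_{n=1}^{N} a_n x_n \right\|.$

Then
(i) if G is a fundamental set, so is H;
(ii) if E is complete and G is a base, H is a base.
In case (ii), furthermore, if the element $x \in E$ has the expansion

$x = \sum_{k=1}^{\infty} c_k y_k,$

the coefficients $c_k$ have the property

(1.4) $\left\| \sum_{k=1}^{\infty} c_k x_k \right\| \le \frac{1}{1 - \lambda} \|x\|.$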


Theorem 1.1 (ii), in the special form which it assumes when G is a normal 
orthogonal base, was given (with a proof which applies to a general base G) 
by Paley and Wiener [26, p. 100] for the Hilbert space $L^2(-\pi, \pi)$. For a general
Banach space, the proof given by Paley and Wiener needs only formal
modifications; in this paper, Theorem 1.1 (ii) will be used almost exclusively 
for Hilbert spaces, and is consequently established by the proof of Paley and 
Wiener (since all realizations of abstract Hilbert space are equivalent). We 
omit the proof of Theorem 1.1 (ii). 

The proof of Theorem 1.1 (i) is considerably simpler; this part would be
sufficient for the applications which will be made in §7 to derivatives of analytic
functions. We suppose that G is fundamental, that H is not, and that
(1.3) is satisfied. Then there is a linear(5) functional f, defined on E, such that
$f(y_n) = 0$, n = 1, 2, …, while $f(z) \ne 0$ for some z. Let

$f(x_n) = f(x_n - y_n) = c_n$  (n = 1, 2, …).

Let $M = \|f\|$; that is, let M be the smallest number such that, for all $x \in E$,
$|f(x)| \le M \|x\|$. Then for any sequence $\{a_n\}$

$\left| \sum_{n=1}^{N} a_n c_n \right| \le M \left\| \sum_{n=1}^{N} a_n (x_n - y_n) \right\| \le M\lambda \left\| \sum_{n=1}^{N} a_n x_n \right\|.$


(5) “Linear” means “distributive and continuous,” as in Banach’s book [21].




Hence(6) there is a linear functional g, defined on E, such that $g(x_n) = c_n$
(n = 1, 2, …), and $\|g\| \le M\lambda < M$. But since $\{x_n\}$ is a fundamental set and
$f(x_n) - g(x_n) = 0$ (n = 1, 2, …), we must have f(x) = g(x) for every x, and
consequently $M = \|f\| = \|g\| \le M\lambda < M$, a contradiction; for M is not zero because
$f(z) \ne 0$ for some z.
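As a finite-dimensional illustration of Theorem 1.1 (not part of the original argument), one may take E to be the plane with the Euclidean norm, $x_n$ the standard basis, and $y_n = x_n - T x_n$ for a small matrix T. The matrix T, the vector x, and all function names below are hypothetical choices made only for this sketch; the Frobenius norm of T is used as an admissible value of $\lambda$ in (1.3) because it bounds the operator norm.

```python
import math

def matvec(T, v):
    """Apply the real matrix T (list of rows) to the vector v."""
    return [sum(T[i][j] * v[j] for j in range(len(v))) for i in range(len(T))]

def norm(v):
    """Euclidean norm of a real vector."""
    return math.sqrt(sum(t * t for t in v))

def coefficients(T, x, iterations=200):
    """Solve c = x + T c by iteration (Neumann series); converges when ||T|| < 1."""
    c = list(x)
    for _ in range(iterations):
        Tc = matvec(T, c)
        c = [x[i] + Tc[i] for i in range(len(x))]
    return c

# Hypothetical perturbation: x_n = e_n (standard basis), y_n = x_n - T x_n.
T = [[0.3, 0.2], [-0.1, 0.4]]
# The Frobenius norm bounds the operator norm, so (1.3) holds with this lambda.
lam = math.sqrt(sum(t * t for row in T for t in row))

x = [1.0, -2.0]
c = coefficients(T, x)

# sum_k c_k y_k = c - T c, which should reproduce x.
Tc = matvec(T, c)
reconstructed = [c[i] - Tc[i] for i in range(2)]
residual = norm([reconstructed[i] - x[i] for i in range(2)])

# The bound (1.4): ||sum_k c_k x_k|| = ||c|| should not exceed ||x|| / (1 - lambda).
coefficient_norm = norm(c)
bound = norm(x) / (1.0 - lam)
print(lam, residual, coefficient_norm, bound)
```

The iteration is the Neumann series for $(I - T)^{-1}$, which is the mechanism behind part (ii) of the theorem in a complete space.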

2. General expansions of analytic functions. We now apply Theorem 1.1
to the spaces $H_p(r)$ whose elements are functions f(z) analytic in $|z| < r$, belonging
to $L^p$ ($p \ge 1$) in this circle; that is, each function f(z) is assumed to
satisfy(7)

(2.1) $\left\{ \frac{1}{2\pi} \int_0^{2\pi} |f(\rho e^{i\theta})|^p \, d\theta \right\}^{1/p} \le A$  ($0 \le \rho < r$),


where A depends only on f. It is well known(8) that if f(z) satisfies (2.1) it
has boundary values almost everywhere on $|z| = r$, and that the boundary
function belongs to $L^p$. We complete the definition of $H_p(r)$ by defining the
norm of f(z) by the relation

$\|f\| = \left\{ \frac{1}{2\pi} \int_0^{2\pi} |f(re^{i\theta})|^p \, d\theta \right\}^{1/p}.$


We introduce, to save repetition, the following 


DEFINITION. A sequence $\{f_n(z)\}$ of functions analytic in $|z| < r$ and belonging
to some class $H_p(r)$ ($1 \le p \le \infty$) has Property T in $|z| < r$ if every function
f(z) analytic in $|z| < r$ and continuous in $|z| \le r$ can be expanded in a unique
series of the form

(2.2) $f(z) = \sum_{n=1}^{\infty} c_n f_n(z),$

the series converging uniformly in every circle $|z| \le r' < r$. If furthermore the
series in (2.2) converges uniformly in $|z| \le r$, the sequence has Property $T_\infty$.


The sequence $(1, z, z^2, \dots)$ is an obvious example of a sequence having
Property $T_\infty$ in any circle.
Applied to the spaces H,(r), Theorem 1.1 yields 


THEOREM 2.1. Let $\{f_n(z)\}$ and $\{g_n(z)\}$ be two sequences of elements of $H_p(r)$
such that for some numbers p and $\lambda$ ($1 \le p \le \infty$, $0 < \lambda < 1$), and for all sets of
complex numbers $a_1, a_2, \dots, a_N$


(6) Banach [21, p. 56]. The result remains valid for complex linear spaces: see Bohnenblust
and Sobczyk [23].

(7) Expressions involving p are to be interpreted according to the usual conventions when
$p = \infty$: that is, as the limits as $p \to \infty$ of the corresponding expressions for finite p.

(8) See, e.g., Zygmund [31, p. 162].


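(2.3) $\left\{ \int_0^{2\pi} \left| \sum_{n=1}^{N} a_n [f_n(re^{i\theta}) - g_n(re^{i\theta})] \right|^p d\theta \right\}^{1/p} \le \lambda \left\{ \int_0^{2\pi} \left| \sum_{n=1}^{N} a_n f_n(re^{i\theta}) \right|^p d\theta \right\}^{1/p}.$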


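Then, in $|z| < r$, $\{g_n\}$ has Property T if $\{f_n\}$ has Property T; if (2.3) is satisfied
with $p = \infty$, $\{g_n\}$ has Property $T_\infty$ if $\{f_n\}$ has Property $T_\infty$. Moreover, if the
expansion of f(z) in terms of $\{g_n(z)\}$ has the form

(2.4) $f(z) = \sum_{n=1}^{\infty} c_n g_n(z),$

the coefficients $c_n$ have the property

(2.5) $\left\| \sum_{k=1}^{\infty} c_k f_k \right\| \le \frac{1}{1 - \lambda} \|f\|.$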


The direct deduction from Theorem 1.1 is that the series in (2.4) converges
to f(z) in the topology of $H_p(r)$. In case $p = \infty$, this is the desired conclusion.
Otherwise, if $|z| \le s < r$ we have

$f(z) - \sum_{n=1}^{N} c_n g_n(z) = \frac{1}{2\pi i} \int_{|w| = r} \frac{f(w) - \sum_{n=1}^{N} c_n g_n(w)}{w - z} \, dw;$

an application of Hölder’s inequality shows that

$\lim_{N\to\infty} \left| f(z) - \sum_{n=1}^{N} c_n g_n(z) \right| = 0,$

uniformly in $|z| \le s$.
We shall use Theorem 2.1 most frequently in the special case when
$f_n(z) = z^{n-1}$. It then becomes


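THEOREM 2.2. The sequence $\{g_n(z)\}$ has Property T in $|z| < r$ if, for all sets
of complex numbers $a_0, a_1, \dots, a_N$,

(2.6) $\left\{ \int_0^{2\pi} \left| \sum_{n=0}^{N} a_n [r^n e^{in\theta} - g_n(re^{i\theta})] \right|^p d\theta \right\}^{1/p} \le \lambda \left\{ \int_0^{2\pi} \left| \sum_{n=0}^{N} a_n r^n e^{in\theta} \right|^p d\theta \right\}^{1/p},$

where p and $\lambda$ satisfy $1 \le p \le \infty$, $0 < \lambda < 1$. If (2.6) is true with $p = \infty$ the sequence
has Property $T_\infty$.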


From Theorem 2.2 we can deduce in a few lines the following generaliza- 
tion of expansion theorems of G. D. Birkhoff [1] and J. L. Walsh [17]. 




THEOREM 2.3. If the functions $g_n(z)$ are analytic in $|z| < r$, continuous in
$|z| \le r$, and satisfy

(2.7) $\sum_{n=0}^{\infty} r^{-2n} |g_n(z) - z^n|^2 < 1$,  $|z| = r$,

the series converging uniformly, then the set $\{g_n(z)\}$ has Property $T_\infty$ in every
circle $|z| < s \le r$.


In Birkhoff’s theorem, (2.7) is replaced by 
(2.8) gn(z) — <1, |2| = +; 
n=O 


this condition implies (2.7), by Cauchy’s inequality. In Walsh’s theorem (2.7) 
holds, and in addition the series in (2.8) is assumed to converge. 

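We apply the case $p = \infty$ of Theorem 2.2. The sum of the series in (2.7)
is continuous on $|z| = s$, when $s \le r$, and so has a maximum $\lambda^2 < 1$. We have,
with $z = se^{i\theta}$,

$\max_\theta \left| \sum_{n=0}^{N} a_n [g_n(z) - z^n] \right| \le \left( \sum_{n=0}^{N} |a_n|^2 s^{2n} \right)^{1/2} \max_\theta \left( \sum_{n=0}^{N} s^{-2n} |g_n(z) - z^n|^2 \right)^{1/2}$

$\le \lambda \left\{ \frac{1}{2\pi} \int_0^{2\pi} \left| \sum_{n=0}^{N} a_n s^n e^{in\theta} \right|^2 d\theta \right\}^{1/2} \le \lambda \max_\theta \left| \sum_{n=0}^{N} a_n z^n \right|.$

This establishes (2.6) with $p = \infty$, and Theorem 2.3 follows.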

In this section we have applied part (ii) of Theorem 1.1. The weaker 
part (i) would yield a weak form of Property T with the uniformly convergent 
series replaced by a uniformly convergent sequence of linear combinations. 

3. Criteria for the existence of expansion theorems. From now on, we
shall use Theorem 2.2 exclusively in the case p = 2, which is the case in which
criteria for the validity of (2.6) are most easily set up. Our functions $g_n(z)$
will, in this section, be of the form

(3.1) $g_n(z) = z^n [1 + h_n(z)],$

where

(3.2) $h_n(z) = \sum_{k=1}^{\infty} \gamma_k^{(n)} z^k$  (n = 0, 1, 2, …).




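We assume to begin with that the $h_n(z)$ have a common majorant h(z); that is,
that

$|\gamma_k^{(n)}| \le \delta_k$  (k = 1, 2, …; n = 0, 1, 2, …),

where $h(z) = \sum_{k=1}^{\infty} \delta_k z^k$ ($|z| < r_0$). This restriction will be considerably relaxed
in §4. We introduce the quantity $K_\rho$ by the definition

$K_\rho^2 = \frac{1}{2\pi} \int_0^{2\pi} |h(\rho e^{i\theta})|^2 \, d\theta.$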


THEOREM 3.1. The functions $g_n(z)$ have Property T in any circle $|z| < s$ provided
that one of the following conditions is satisfied:

(3.4) $h(s) < 1,$

(3.5) $s < \sup_{\rho < r_0} \frac{\rho}{(K_\rho^2 + 1)^{1/2}}.$


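We have to verify (2.6) with p = 2, r = s, for an arbitrary set $(a_0, a_1, \dots, a_N)$.
We write

$a_n' = a_n$ (n = 0, 1, …, N),  $a_n' = 0$ (n > N),

$\psi(z) = \sum_{n=0}^{\infty} |a_n'| z^n$,  $\psi_m(z) = \sum_{n=0}^{m} |a_n'| z^n.$

Then condition (2.6) takes the form

(3.6) $\Phi(r) \equiv \frac{1}{2\pi} \int_0^{2\pi} \left| \sum_{n=0}^{\infty} a_n' r^n e^{in\theta} h_n(re^{i\theta}) \right|^2 d\theta \le \lambda^2 \sum_{n=0}^{\infty} |a_n'|^2 r^{2n}.$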


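$\Phi(r)$ can be rewritten as follows.

(3.7) $\Phi(r) = \frac{1}{2\pi} \int_0^{2\pi} \left| \sum_{n=0}^{\infty} a_n' r^n e^{in\theta} \sum_{k=1}^{\infty} \gamma_k^{(n)} r^k e^{ik\theta} \right|^2 d\theta = \frac{1}{2\pi} \int_0^{2\pi} \left| \sum_{m=1}^{\infty} r^m e^{im\theta} \sum_{n=0}^{m-1} a_n' \gamma_{m-n}^{(n)} \right|^2 d\theta = \sum_{m=1}^{\infty} r^{2m} \left| \sum_{n=0}^{m-1} a_n' \gamma_{m-n}^{(n)} \right|^2.$

In the first place, we evidently have

(3.8) $\Phi(r) \le \sum_{m=1}^{\infty} r^{2m} \left( \sum_{n=0}^{m-1} |a_n'| \delta_{m-n} \right)^2.$

If we retrace the steps in (3.7), we then find

$\Phi(r) \le \frac{1}{2\pi} \int_0^{2\pi} \left| \sum_{n=0}^{\infty} |a_n'| r^n e^{in\theta} h(re^{i\theta}) \right|^2 d\theta \le h(r)^2 \sum_{n=0}^{\infty} |a_n'|^2 r^{2n}.$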

Thus if h(r) <1, (3.6) is satisfied; this proves Theorem 3.1 under condition 
(3.4). 
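We now observe that the expression

$\sum_{n=0}^{m-1} |a_n'| \delta_{m-n}$

which occurs on the right of (3.8) is the coefficient of $z^m$ in the power series of
$\psi_{m-1}(z) h(z)$, and consequently can be written as

$\frac{1}{2\pi i} \int_{|z| = \rho} \frac{h(z) \psi_{m-1}(z)}{z^{m+1}} \, dz.$

Hence its square does not exceed

$\rho^{-2m} \cdot \frac{1}{2\pi} \int_0^{2\pi} |h(\rho e^{i\theta})|^2 d\theta \cdot \frac{1}{2\pi} \int_0^{2\pi} |\psi_{m-1}(\rho e^{i\theta})|^2 d\theta = K_\rho^2 \rho^{-2m} \sum_{n=0}^{m-1} |a_n'|^2 \rho^{2n}.$

From (3.8) we now obtain, if $r < \rho$,

$\Phi(r) \le K_\rho^2 \sum_{m=1}^{\infty} \left( \frac{r}{\rho} \right)^{2m} \sum_{n=0}^{m-1} |a_n'|^2 \rho^{2n} = K_\rho^2 \sum_{n=0}^{\infty} |a_n'|^2 \rho^{2n} \sum_{m=n+1}^{\infty} \left( \frac{r}{\rho} \right)^{2m} = \frac{K_\rho^2 r^2}{\rho^2 - r^2} \sum_{n=0}^{\infty} |a_n'|^2 r^{2n}.$

Then (3.6) is satisfied if we can choose $\rho$ so that $K_\rho^2 r^2 / (\rho^2 - r^2) < 1$, or so that

$r^2 < \frac{\rho^2}{K_\rho^2 + 1}.$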



This shows that the relation (3.6) is satisfied if r=s and s satisfies inequality 
(3.5). 
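The two criteria of Theorem 3.1 can be compared numerically for a concrete majorant. The sketch below assumes $h(z) = e^z - 1$ (the majorant that reappears in §6), and reads $K_\rho^2$ as the mean square of h on $|z| = \rho$, that is $\sum_{k \ge 1} \rho^{2k}/(k!)^2$; the grid of $\rho$ values and all function names are illustrative choices, not taken from the paper.

```python
import math

def k_squared(rho, terms=60):
    """Mean square of h(z) = e^z - 1 on |z| = rho: sum over k >= 1 of rho^(2k) / (k!)^2."""
    total, term = 0.0, 1.0
    for k in range(1, terms + 1):
        term *= (rho / k) ** 2      # term is now rho^(2k) / (k!)^2
        total += term
    return total

def bound_35(rho):
    """Right-hand side of (3.5) for a single rho: rho / sqrt(K_rho^2 + 1)."""
    return rho / math.sqrt(k_squared(rho) + 1.0)

# Criterion (3.4): h(s) = e^s - 1 < 1, i.e. s < log 2.
bound_34 = math.log(2.0)

# Criterion (3.5): approximate the supremum over rho on a grid.
grid = [i / 1000.0 for i in range(1, 3001)]
best_rho = max(grid, key=bound_35)
best_35 = bound_35(best_rho)

print("criterion (3.4): s <", bound_34)
print("criterion (3.5): s <", best_35, "attained near rho =", best_rho)
```

For this particular majorant the two bounds come out close to one another, with (3.4) slightly larger; other majorants may favor (3.5).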

In earlier theorems of the same type as Theorem 3.1, the conditions have
restricted the coefficients $\delta_k$ of the function h(z) majorizing the $h_n(z)$; here
we restrict only the behavior of h(z) in the large(9). Two less precise known
theorems can be deduced as corollaries of Theorem 3.1.


THEOREM 3.2(10). If $|h_n(z)| \le M(\rho)$ (n = 0, 1, 2, …; $|z| \le \rho$), then the functions
$g_n(z)$ have Property T in $|z| < s$ if

(3.9) $s < \sup_{0 < \rho < r_0} \frac{\rho}{M(\rho) + 1}.$

In fact, Cauchy’s inequalities for derivatives yield, for $0 < \rho < r_0$,

$|\gamma_k^{(n)}| \le M(\rho) \rho^{-k}$  (k = 1, 2, …).

Consequently, if $\rho < r_0$, we can take $\delta_k = M(\rho) \rho^{-k}$ (k = 1, 2, …); we then
have $h(r) = rM(\rho)/(\rho - r)$, and $h(r) < 1$ if $r < \rho/[M(\rho) + 1]$. If we choose $\rho$ in
the most favorable way, Theorem 3.2 follows from Theorem 3.1.
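The step from the Cauchy estimates to the threshold $\rho/[M(\rho) + 1]$ is a geometric series, and can be checked numerically. In the sketch below the values of M and $\rho$ are arbitrary sample numbers and the function names are hypothetical.

```python
def majorant_partial_sum(M, rho, r, terms=2000):
    """Partial sum of h(r) = sum_{k>=1} delta_k r^k with delta_k = M * rho^(-k)."""
    return sum(M * (r / rho) ** k for k in range(1, terms + 1))

def majorant_closed_form(M, rho, r):
    """Closed form h(r) = r M / (rho - r), valid for 0 <= r < rho."""
    return r * M / (rho - r)

# Sample values (assumptions for illustration only).
M, rho = 3.0, 2.0
threshold = rho / (M + 1.0)          # h(r) < 1 exactly when r < threshold

series_value = majorant_partial_sum(M, rho, 0.4)
closed_value = majorant_closed_form(M, rho, 0.4)
just_below = majorant_closed_form(M, rho, threshold - 0.01)
just_above = majorant_closed_form(M, rho, threshold + 0.01)
print(threshold, series_value, closed_value, just_below, just_above)
```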


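THEOREM 3.3(11). If

(3.10) $L_\rho = \sup_k \rho^k \delta_k$  ($\rho < r_0$),

the functions $g_n(z)$ have Property T in $|z| < s$ if

(3.11) $s < \sup_{\rho < r_0} \frac{\rho}{L_\rho + 1}.$

We have

$\delta_k \le \rho^{-k} L_\rho$,  $h(r) \le L_\rho \sum_{n=1}^{\infty} \left( \frac{r}{\rho} \right)^n = L_\rho \frac{r}{\rho - r};$

hence $h(s) < 1$ if s satisfies (3.11). The conclusion follows from Theorem 3.1.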

For use in §5, we note the following property of the coefficients in the ex- 
pansion whose existence is established by Theorems 3.1 and 3.2. If f(z) has 
the expansion 


(9) However, Theorem 3.1 (even as generalized in §4) does not seem to include Theorem III
of G. S. Ketchum [4].

(10) Narumi [7], Takenaka [15], G. S. Ketchum [4].

(11) Graesser [2]. See also G. S. Ketchum [4, p. 215, footnote].


$f(z) = \sum_{n=0}^{\infty} c_n g_n(z)$,  $|z| < s$,

there is a number A(s), not depending on f(z), such that

(3.12) $\sum_{n=0}^{\infty} |c_n|^2 s^{2n} \le A(s) \int_0^{2\pi} |f(se^{i\theta})|^2 \, d\theta.$

This follows from the last part of Theorem 2.1. 

4. Improvement of the criteria. The conditions established in §3 can be
generalized by restricting the $h_n(z)$ only for large n. The generalized conditions
occur in part in the literature, and in part are new. It turns out that
only the behavior of the $h_n(z)$ for large n is relevant to the existence of expansions
of the type which we consider, as the following lemma shows.


LEMMA. Suppose that $g_n(z)$ and $g_n^*(z)$ are analytic in $|z| < r_0$, and
$g_n(z) = g_n^*(z)$ for n > N. If $\{g_n(z)\}$ has Property T in $|z| < s_1 < r_0$, and $\{g_n^*(z)\}$
has Property T in every circle $|z| < s^* \le s_2$, where $r_0 \ge s_2 > s_1$, then $\{g_n(z)\}$ has
Property T in $|z| < s_2$.

Let F(z) be an arbitrary function analytic in $|z| < s_2$ and continuous in
$|z| \le s_2$, and let

(4.1) $F(z) = \sum_{n=0}^{\infty} c_n g_n(z),$

where the series converges uniformly in any circle $|z| \le s_1' < s_1$. Define a function
G(z) by the relation

(4.2) $G(z) = F(z) - \sum_{n=0}^{N} c_n g_n(z) = \sum_{n=N+1}^{\infty} c_n g_n(z) = \sum_{n=N+1}^{\infty} c_n g_n^*(z).$

Now G(z) has a unique expansion of the form

(4.3) $G(z) = \sum_{n=0}^{\infty} d_n g_n^*(z),$

the series converging uniformly in $|z| \le s_2' < s_2$. By comparison with (4.2), we
see that $d_n = 0$ (n = 0, 1, 2, …, N). Hence the series in (4.1) converges uniformly
in $|z| \le s_2'$, and necessarily converges to F(z). Since $s_2'$ is any number
less than $s_2$, the proof is complete.

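We now suppose, as in §3, that $g_n(z) = z^n [1 + h_n(z)]$, where

$h_n(z) = \sum_{k=1}^{\infty} \gamma_k^{(n)} z^k.$

We suppose further that

$|\gamma_k^{(n)}| \le \delta_k^{(n)}$  (k = 1, 2, …; n = 0, 1, 2, …),

where the series

$H_n(z) = \sum_{k=1}^{\infty} \delta_k^{(n)} z^k$

converge in $|z| < r_0$. We introduce, for $\rho < r_0$, the quantities

$K_{\rho,n}^2 = \frac{1}{2\pi} \int_0^{2\pi} |H_n(\rho e^{i\theta})|^2 \, d\theta$,  $L_\rho^{(n)} = \sup_k \rho^k \delta_k^{(n)},$

and we then set

(4.4) $K_\rho = \limsup_{n\to\infty} K_{\rho,n}$,  $L_\rho = \limsup_{n\to\infty} L_\rho^{(n)}$,  $h(r) = \limsup_{n\to\infty} H_n(r).$

Then we can state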


Then we can state 


THEOREM 4.1. The functions $g_n(z)$ have Property T in any circle $|z| < s$ provided
that s satisfies one of the following three conditions(12):

(4.5) $h(s) < 1,$

(4.6) $s < \sup_{\rho < r_0} \frac{\rho}{(K_\rho^2 + 1)^{1/2}},$

(4.7) $s < \sup_{\rho < r_0} \frac{\rho}{L_\rho + 1}.$


Theorem 4.1 states that Theorems 3.1 and 3.3 remain valid when $K_\rho$, $L_\rho$,
and h(r) are defined by (4.4). The theorem follows at once from the lemma,
with $g_n^*(z) = z^n$ for $n \le N$, where N is chosen sufficiently large. It is only necessary
to verify that the functions $g_n(z)$ have Property T in some circle $|z| < s_1$.
An application of Theorem 3.1 shows at once that this is true, with (for example)
$s_1$ such that

$\sup_n H_n(s_1) < 1.$

Alternatively, we may suppose that, for n = 0, 1, 2, …,

$|h_n(z)| \le M_n(\rho)$  ($|z| \le \rho$),

and that

(4.8) $M(\rho) = \limsup_{n\to\infty} M_n(\rho)$

is finite. Then we can state
(12) For the theorem under (4.7), see G. S. Ketchum [4].


THEOREM 4.2(13). The conclusion of Theorem 3.2 holds if the quantity $M(\rho)$
in (3.9) is defined by (4.8).


Expansions in terms of functions $g_n(z)$, analytic in $|z| < r_0$, are particularly
interesting if the expansion of every f(z) which is analytic in $|z| < s$ converges
in every circle $|z| \le s' < s$ (where naturally $s \le r_0$). This property
(which we may call Property U) is possessed, of course, by the functions $z^n$.
Using the theorems of this section, we can easily obtain the following sufficient
conditions for a set $g_n(z)$ to have Property U:

(4.9) $h(r_0) \le 1,$
(4.10) $L_{r_0} = 0,$
(4.11) $M(r_0) = 0,$

where $h(r_0)$, $L_{r_0}$, and $M(r_0)$ denote the limits of the respective functions of r
(defined in (4.4) and (4.8)) as $r \to r_0$. Condition (4.11) shows, for example,
that if

(4.12) $h_n(z) = o(1),$

uniformly with respect to z in each circle $|z| \le r' < r_0$, then the set $\{g_n(z)\}$
$= \{z^n [1 + h_n(z)]\}$ has Property U. This result was obtained by Sheffer and
by Takahashi(14); it generalizes a result of Widder [20], in which the condition
$h_n(z) = O(1/n)$ appears instead of (4.12). Condition (4.9) will sometimes
establish Property U when (4.12) is not satisfied. For example, if


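$h_n(z) = \sum_{k=1}^{\infty} \gamma_k^{(n)} z^k$,  $|\gamma_k^{(n)}| \le \frac{1}{k(k+1)},$

we may take

$h(z) = \sum_{k=1}^{\infty} \frac{z^k}{k(k+1)} = 1 + \frac{1 - z}{z} \log (1 - z),$

and $h(r) < 1$ if $r < 1$. In this case the corresponding functions $g_n(z)$ have Property
U in $|z| < 1$, although neither (4.10), (4.11), nor (4.12) is necessarily
satisfied.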
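The closed form of this majorant follows from $1/(k(k+1)) = 1/k - 1/(k+1)$ and the logarithmic series. A short numerical check, with sample radii chosen arbitrarily and hypothetical function names, is sketched below.

```python
import math

def h_series(r, terms=2000):
    """Partial sum of sum_{k>=1} r^k / (k (k + 1))."""
    return sum(r ** k / (k * (k + 1)) for k in range(1, terms + 1))

def h_closed(r):
    """Closed form 1 + ((1 - r) / r) * log(1 - r), for 0 < r < 1."""
    return 1.0 + (1.0 - r) / r * math.log1p(-r)

# Largest discrepancy between series and closed form at a few sample radii.
max_gap = max(abs(h_series(r) - h_closed(r)) for r in (0.1, 0.5, 0.9))

# The majorant stays below 1 even very close to the unit circle.
near_boundary = h_closed(0.999)
print(max_gap, near_boundary)
```

The value at r = 0.999 is still below 1, in agreement with the fact that the limit of h(r) as r tends to 1 equals 1.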

5. Further generalizations. In Theorems 4.1 and 4.2 we made restrictions 
on the individual coefficients in the power series of the functions h,(z). In 
this section a method will be developed for replacing such restrictions by re- 


(13) Takenaka [15], G. S. Ketchum [4].
(14) Sheffer [10, pp. 588, 597], Takahashi [13]. Cf. G. S. Ketchum [4, p. 215].


strictions on linear combinations of the coefficients. The results obtained 
(which could evidently be generalized still further) overlap those of G. S. 
Ketchum [4], who first obtained such results. 

Suppose that L is a one-to-one linear operation from $H_2(r)$ to $H_2(r)$ (so
that L has a linear inverse). If the set of functions $G_n(z) = L[g_n(z)]$ is a base,
and f(z) is an arbitrary element of $H_2(r)$, we have a unique expansion

$L[f(z)] = \sum_{n=0}^{\infty} a_n G_n(z) = \sum_{n=0}^{\infty} a_n L[g_n(z)];$

since $L^{-1}$ is continuous, we have

$f(z) = \sum_{n=0}^{\infty} a_n g_n(z);$

if we also have

$f(z) = \sum_{n=0}^{\infty} b_n g_n(z),$

then

$L[f(z)] = \sum_{n=0}^{\infty} b_n G_n(z),$

and $a_n = b_n$ (n = 0, 1, 2, …). The convergence is convergence in the topology
of $H_2(r)$; this, as we have seen, implies uniform convergence in every circle
$|z| \le r' < r$. Hence an expansion theorem for the functions $L[g_n(z)]$ gives rise
to an expansion theorem for the $g_n(z)$ themselves. A trivial, but not unimportant,
illustration is given by the operator L which transforms $g_n(z)$ into
$\phi(z) g_n(z)$, where $\phi(z)$ is analytic and bounded in $|z| \le r$, $\phi(0) = 1$, and $\phi(z) \ne 0$
in $|z| \le r$.

We now discuss a case which is not entirely covered by the procedure just
outlined; it includes some of the results of G. S. Ketchum mentioned above.
For simplicity we consider only the special case when the coefficients of the
functions $h_n(z)$ are combined two at a time. Let $\{k_\nu\}$ ($\nu$ = 1, 2, …) be a sequence
of complex numbers such that


lim sup | k,|*” < 1, 
so that A(z) =) is analytic in |z| <1. If 


fle) = 


we define 


(6.1) Fe) + = + 


then 
yun 




By Hadamard’s multiplication theorem(15), F(z) has no singularities inside
the circle $|z| < r$. We have the representation


1 
(5.2) F(z) = f(z) — (| 2| <r <r) 


where C is the circle $|w| = r' < r$.
Let us now consider the expansion of a given analytic function f(z) in
terms of a set of functions

$g_n(z) = z^n [1 + h_n(z)],$

with $h_n(z)$ analytic in $|z| < r$ and $h_n(0) = 0$. We introduce the functions

(5.3) $F(z) = L[f]$,  $G_n(z) = L[g_n];$

it is clear that we have $G_n(z) = z^n [1 + H_n(z)]$, where the $H_n(z)$ are analytic
in $|z| < r$ and $H_n(0) = 0$. (It is not in general true that $G_n(z) \in H_2(r)$ if
$g_n(z) \in H_2(r)$.) Let us suppose that the $G_n(z)$ satisfy one of the conditions
of Theorems 4.1 and 4.2, so that we have in $|z| < r$ a unique expansion


$F(z) = \sum_{n=0}^{\infty} c_n G_n(z)$

converging uniformly in any circle $|z| \le s < r$, with

$\sum_{n=0}^{\infty} |c_n|^2 s^{2n} \le A(s) \int_0^{2\pi} |F(se^{i\theta})|^2 \, d\theta;$

the last relation follows from the remark made at the end of §3. From (5.2)
and (5.3) we then have


1 1 
(5.4) fe) — ff AG/w)f(w)dw = J 


the series converging in $|z| < r$, uniformly in $|z| \le s < r$. If s is temporarily
fixed, and we take $\rho$ so that $s < \rho < r$, the series $\sum |c_n|^2 \rho^{2n}$ is convergent, and
the functions $w^{-n} g_n(w)$ are uniformly bounded on $|w| = s$. It follows, by an
application of Cauchy’s inequality, that the series $\sum c_n g_n(w)$ is uniformly (and
absolutely) convergent on $|w| = s$ and so in $|w| \le s$. Since the series on the
right of (5.4) is uniformly convergent in any circle $|z| \le s < r$, we have


1 


in |z| <r, both series being uniformly convergent in any circle |z| <s<r. 
That is, the function f*(z) defined by 


(15) See, e.g., Dienes [24, p. 346].


fre) = fe) = ote (6%. = 0) 


is analytic in |z| <r, and we have L[f*(z)]=0 in |z| <r. From (5.1) we see 
that this means that 


b* + = 0 
and hence that $b_\nu^* = 0$ ($\nu$ = 0, 1, 2, …). That is,

$f(z) = \sum_{n=0}^{\infty} c_n g_n(z),$


the series converging uniformly in any circle | z| Ss<r. We sum up our con- 
clusions in a formal theorem. 


THEOREM 5.1. If the functions $G_n(z) = L[g_n(z)]$, where L is defined by (5.1),
satisfy the conditions of Theorem 4.1 or Theorem 4.2, and s is defined as in those
theorems, the functions $g_n(z)$ have Property T in $|z| < s$.


For example, if the numbers k, satisfy 
lim sup | & |!” S 1; 
(n) 
h, (2) Yr 2 (| < ro), 


(n) (n) 
| + | B, 


ie) = > Bar <1), 
yenl 


and $h(s) < 1$, then the functions $z^n [1 + h_n(z)]$ have Property T in $|z| < s$.
It is clear that linear operations other than that defined in (5.1) could also 
be used. 


6. Special expansion theorems. 


THEOREM 6.1. If $\phi(z)$ is an analytic function whose Maclaurin series has
positive coefficients(16), and radius of convergence R ($R \le \infty$), if $\phi(0) = 1$, and if
the complex numbers $\alpha_n$ satisfy

(6.1) $\limsup_{n\to\infty} |\alpha_n| \le 1$,  $\sup_n |\alpha_n| < R/\phi^{-1}(2),$

the functions

(6.2) $g_n(z) = z^n \phi(\alpha_n z)$  (n = 0, 1, 2, …)

have Property T in any circle $|z| \le s < \phi^{-1}(2)$.


(16) That is, $\phi(z)$ is absolutely monotonic on the segment (0, R) of the real axis.


Here we have $h_n(z) = \phi(\alpha_n z) - 1$; and, in the notation of §4,

$H_n(r) = \phi(|\alpha_n| r) - 1$,  $h(r) \le \phi(r) - 1,$

and $h(r) < 1$ if $\phi(r) < 2$.

In particular, we may have(17) $\phi(z) = e^z$.

Various modifications of the situation considered in Theorem 6.1 are pos- 
sible. We shall discuss three which have interesting applications. 


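THEOREM 6.2. The functions $g_n(z)$ defined by

$g_0(z) = 1,$

$g_n(z) = z^{n-1} \frac{e^{\alpha_n z} - e^{\beta_n z}}{\alpha_n - \beta_n}$

($|\alpha_n| \le 1$, $|\beta_n| \le 1$, $\alpha_n \ne \beta_n$; n = 1, 2, …)
have Property T in the circle $|z| < r$ if $r < \log 2$.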


It is easy to show that we may take $h(r) = e^r - 1$ in this case. For details,
the reader is referred to the author’s note [22] where Theorem 6.2 is applied 
to show that an entire function of exponential type less than log 2 has an 
infinite number of derivatives which are univalent in the unit circle, unless 
it is a polynomial. 

For the next two theorems, it is necessary to go back to Theorem 2.2. 


THEOREM 6.3. The functions gn(z) defined by 
Ban(2) = 1), 
= 
have Property T in any circle |z| <r<0.780. 


We note that log 2 = 0.693, $\pi/4$ = 0.785. We thus have more than Theorem
6.1 would establish, but still less than the result which may be conjectured(18),
that Theorem 6.1 holds, when $\phi(z) = e^z$, for $s < \pi/4$.

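By Theorem 2.2, Theorem 6.3 will follow if we show that for every sequence
$\{a_n\}$ of complex numbers and for every N

(6.3) $\frac{1}{2\pi} \int_0^{2\pi} \left| \sum_{n=0}^{N} a_n [g_n(re^{i\theta}) - r^n e^{in\theta}] \right|^2 d\theta \le \lambda(r) \sum_{n=0}^{N} |a_n|^2 r^{2n}$,  $r < 0.780$,

with $\lambda(r) < 1$. The left-hand side, by the reasoning of Theorem 3.1, does not
exceed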
exceed 


(17) The corresponding theorem, with region of convergence $|z| < 1/e$, was given by Takenaka
[15]. See also footnote 2.
(18) See footnote 21.
17% 2 
Jo n=0 
1 
= — 118] 
0 


1 
=— | — 1) |2d0 + | ¥(2)(e~* — 1) |*a9, 


1 


since |¥(z)| is periodic in 6 with period 7. Thus the left side of (6.3) does not 
exceed 


—{e —1)?+ max |e*— 


1 (N/2] 
= —1)*+ max |e*— DX | 
2 ont 
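We have

$|e^{-z} - 1|^2 = e^{-2r\cos\theta} - 2e^{-r\cos\theta} \cos (r \sin\theta) + 1 \le e^{-2r\cos\theta} - 2e^{-r\cos\theta} \cos r + 1 = A(\theta),$

say. Now

$A'(\theta) = 2r \sin\theta \, e^{-r\cos\theta} (e^{-r\cos\theta} - \cos r),$

and vanishes only when $\theta = 0$ or when $\cos\theta = (1/r) \log (1/\cos r)$. In the latter
case, $\exp(-r\cos\theta) = \cos r$, and

$A(\theta) = 1 - \cos^2 r < 2(1 - \cos r) = A(\tfrac{1}{2}\pi).$

Consequently $A(\theta)$ assumes its maximum when $\theta = 0$ or $\theta = \tfrac{1}{2}\pi$. For r = 0.780,
we find that $A(0) < A(\tfrac{1}{2}\pi)$. In fact, this inequality is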


— + 1 < 2(1 — cos r) + cos r — 


which is equivalent to 


which is satisfied for r=0.780. Hence 
$A(\theta) \le A(\tfrac{1}{2}\pi) = 2(1 - \cos r),$

and the left side of (6.3) does not exceed

$\tfrac{1}{2} \{ (e^r - 1)^2 + 2(1 - \cos r) \} \sum_{n=0}^{N} |a_n|^2 r^{2n};$

the brace is less than 2 when r = 0.780. This completes the proof of Theorem
6.3.
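The numerical claims at the end of this proof can be checked directly. The sketch below takes $A(\theta) = e^{-2r\cos\theta} - 2e^{-r\cos\theta}\cos r + 1$, which is one reading of the damaged display (an assumption), restricts $\theta$ to $[0, \tfrac{1}{2}\pi]$ (also an assumption), and evaluates the brace $(e^r - 1)^2 + 2(1 - \cos r)$; the function names are hypothetical.

```python
import math

def A(theta, r):
    """Assumed form of A(theta): exp(-2 r cos t) - 2 exp(-r cos t) cos r + 1."""
    e = math.exp(-r * math.cos(theta))
    return e * e - 2.0 * e * math.cos(r) + 1.0

def brace(r):
    """The brace (e^r - 1)^2 + 2 (1 - cos r) from the final estimate."""
    return (math.exp(r) - 1.0) ** 2 + 2.0 * (1.0 - math.cos(r))

r = 0.780
value_at_zero = A(0.0, r)
value_at_half_pi = A(math.pi / 2.0, r)

# Maximum of A on a grid over [0, pi/2]; it should be attained at pi/2.
grid = [i * (math.pi / 2.0) / 2000 for i in range(2001)]
grid_max = max(A(t, r) for t in grid)

brace_at_r = brace(r)
brace_at_quarter_pi = brace(math.pi / 4.0)
print(value_at_zero, value_at_half_pi, grid_max, brace_at_r, brace_at_quarter_pi)
```

Under these assumptions the brace is below 2 at r = 0.780 but exceeds 2 at $r = \pi/4$, so this particular estimate cannot reach the conjectured constant.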




THEOREM 6.4. If $r < \log 2$, and the complex numbers $\beta_n$ are such that

(6.4) $\sum_{n=0}^{\infty} |\beta_n|^2 r^{-2n} < 2e^r - e^{2r},$

the functions $g_n(z)$ defined by

(6.5) $g_n(z) = z^n e^{\alpha_n z} - \beta_n$  ($|\alpha_n| \le 1$)

have Property T in $|z| < r$.


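To apply Theorem 2.2, we need to show that if $h_n(z) = e^{\alpha_n z} - 1$, then for
all $\{a_n\}$ and N,

(6.6) $\frac{1}{2\pi} \int_0^{2\pi} \left| \sum_{n=0}^{N} a_n [z^n h_n(z) - \beta_n] \right|^2 d\theta \le \lambda(r) \sum_{n=0}^{N} |a_n|^2 r^{2n}$  ($z = re^{i\theta}$),

with $\lambda(r) < 1$, when r and $\{\beta_n\}$ satisfy (6.4).
The left side of (6.6) may be written in the form

$\frac{1}{2\pi} \int_0^{2\pi} \left| \sum_{n=0}^{N} a_n z^n h_n(z) \right|^2 d\theta - 2\,\mathrm{Re} \left\{ \overline{\sum_{n=0}^{N} a_n \beta_n} \cdot \frac{1}{2\pi} \int_0^{2\pi} \sum_{n=0}^{N} a_n z^n h_n(z) \, d\theta \right\} + \left| \sum_{n=0}^{N} a_n \beta_n \right|^2 = S_1 - S_2 + S_3.$

Now, by the proof of Theorem 3.1,

(6.7) $S_1 \le (e^r - 1)^2 \sum_{n=0}^{N} |a_n|^2 r^{2n},$

and

(6.8) $S_3 = \left| \sum_{n=0}^{N} a_n \beta_n \right|^2 \le \sum_{n=0}^{N} |\beta_n|^2 r^{-2n} \sum_{n=0}^{N} |a_n|^2 r^{2n},$

by Cauchy’s inequality. Finally,

$S_2 = 0,$

since $h_n(0) = 0$. Combining this with (6.7) and (6.8), we have

$\frac{1}{2\pi} \int_0^{2\pi} \left| \sum_{n=0}^{N} a_n [z^n h_n(z) - \beta_n] \right|^2 d\theta \le \left\{ (e^r - 1)^2 + \sum_{n=0}^{N} |\beta_n|^2 r^{-2n} \right\} \sum_{n=0}^{N} |a_n|^2 r^{2n}.$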




Thus by Theorem 2.2 the system (6.5) has Property T if the brace in the last 
inequality is less than one. This will clearly be true if (6.4) is satisfied. 
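The role of the bound log 2 can be seen from the identity $1 - (e^r - 1)^2 = 2e^r - e^{2r}$: the brace $(e^r - 1)^2 + \sum |\beta_n|^2 r^{-2n}$ is below one exactly when the weighted sum of the $|\beta_n|^2$ is below $2e^r - e^{2r}$, and that quantity is positive only for $r < \log 2$. The sketch below checks this numerically; the sample sequence $\beta_n = 0.6\,(r/2)^n$ is a hypothetical choice, as are the function names.

```python
import math

def rhs_64(r):
    """Right-hand side 2 e^r - e^(2r) of the condition on the beta_n."""
    return 2.0 * math.exp(r) - math.exp(2.0 * r)

def brace(r, beta_sum):
    """(e^r - 1)^2 plus the weighted sum of |beta_n|^2 r^(-2n)."""
    return (math.exp(r) - 1.0) ** 2 + beta_sum

# Identity behind the condition: 1 - (e^r - 1)^2 = 2 e^r - e^(2r).
identity_gap = max(abs(1.0 - (math.exp(r) - 1.0) ** 2 - rhs_64(r))
                   for r in (0.1, 0.3, 0.5, 0.69))

# Hypothetical sample: beta_n = 0.6 * (r / 2)^n, so |beta_n|^2 r^(-2n) = 0.36 * 4^(-n).
r = 0.5
beta_sum = sum(0.36 * 4.0 ** (-n) for n in range(200))   # close to 0.48
sample_brace = brace(r, beta_sum)

print(identity_gap, rhs_64(0.5), rhs_64(0.7), beta_sum, sample_brace)
```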

7. Applications. We now use the theorems of §6 to prove theorems con- 
cerning the values taken by derivatives of entire functions of order one and 
exponential type. We need the following lemma. 


LEMMA(19). If f(z) is an entire function of exponential type k, it has the
representation

(7.1) $f(z) = \frac{1}{2\pi i} \int_C e^{zw} F(w) \, dw,$

where C is any circle $|w| = k' > k$, and F(w) is analytic outside $|w| = k$.
It follows that

(7.2) $f^{(n)}(z) = \frac{1}{2\pi i} \int_C w^n e^{zw} F(w) \, dw$  (n = 1, 2, …).

If now the functions $g_n(w)$ have Property T in $|w| < k'$, we can expand the
function $e^{zw}$ in terms of them, substitute in (7.1), and integrate term by
term(20). We thus obtain an expansion of the form

(7.3) $f(z) = \sum_{n=0}^{\infty} c_n(z) \, \frac{1}{2\pi i} \int_C g_n(w) F(w) \, dw.$


We can now establish the following theorems. 
THEOREM 7.1. If f(z) is an entire function of exponential type $k < \log 2$, and
if $f(0) = 1$, $|\alpha_n| \le 1$, and r < k, the inequality

(7.4) $\sum_{n=0}^{\infty} |f^{(n)}(\alpha_n)|^2 r^{-2n} \ge 2e^r - e^{2r}$

is valid.

If (7.4) is not true, Theorem 6.4 applies, with $\beta_n = f^{(n)}(\alpha_n)$, and (7.3) has
the form

$f(z) = \sum_{n=0}^{\infty} c_n(z) \{ f^{(n)}(\alpha_n) - \beta_n f(0) \} = 0,$

which is impossible since $f(0) = 1$.
As a corollary we obtain the following theorem of S. Takenaka [29].
THEOREM 7.2. If f(z) is an entire function of exponential type $k < \log 2$,
and(21) $|\alpha_n| \le 1$, then


(19) See Pólya [27, pp. 580 ff.].
(20) Cf. Whittaker [30, p. 67], Gelfond [25].
(21) Or even if $\limsup |\alpha_n| \le 1$.


(7.5) $f^{(n)}(\alpha_n) = 0$  (n = 0, 1, 2, …)
implies $f(z) \equiv 0$.

For, if $f(0) \ne 0$, we consider f(z)/f(0), to which Theorem 7.1 applies, since
the left side of (7.4) is zero if (7.5) is satisfied. If $f(0) = 0$, while $f(z_0) \ne 0$ for
some $z_0$ in $|z| < 1$, we apply Theorem 7.1 to the function $[f(z_0) - f(z)]/f(z_0)$,
taking $\alpha_0 = z_0$.

From Theorem 6.3, we obtain 


THEOREM 7.3. If f(z) is an entire function of exponential type k <0.780, and 
$1, then the conditions 


= f2(a,) = 0 = 0, 
imply that f(z) =0. 


This is more than follows from Theorem 7.2, but less than would follow
if Theorem 7.2 were proved to be true with $k < \pi/4$, as has been conjectured(22).

Analogous theorems concerning functions analytic in a finite circle(23) can
be proved by developing $(w - z)^{-1}$, as a function of w, in terms of (for example)


(1 — 


and substituting the expansion into Cauchy’s integral formula. 


PAPERS ON GENERAL EXPANSION THEOREMS 


1. G. D. Birkhoff, Sur une généralisation de la série de Taylor, Comptes Rendus Hebdomadaires
des Séances de l’Académie des Sciences, Paris, vol. 163 (1917), pp. 942-945.

2. R. F. Graesser, A certain general type of Neumann expansions and expansions in con- 
fluent hypergeometric functions, American Journal of Mathematics, vol. 49 (1927), pp. 577-597. 

3. S. Izumi, On the expansion of analytic function, Tôhoku Mathematical Journal, vol. 28
(1927), pp. 97-106. 

4. G. S. Ketchum, On certain generalizations of the Cauchy-Taylor expansion theory, these 
Transactions, vol. 40 (1936), pp. 208-224. 

5. P. W. Ketchum, Infinite systems of linear equations and expansions of analytic functions, 
Duke Mathematical Journal, vol. 4 (1938), pp. 668-677. 

6. T. Kubota, Eine Verallgemeinerung des Taylor-Cauchyschen Satzes, Tôhoku Mathematical
Journal, vol. 22 (1923), pp. 336-347.

7. S. Narumi, A theorem on the expansion of analytic functions, Tôhoku Mathematical
Journal, vol. 30 (1929), pp. 441-444.

8. Y. Okada, On a certain expansion of analytic function, Tôhoku Mathematical Journal,
vol. 22 (1923), pp. 325-335.

9. S. Pincherle, Sopra alcuni sviluppi in serie per funzioni analitiche, Memorie della Reale 
Accademia delle Scienze dell’Istituto di Bologna, (4), vol. 3 (1881), pp. 151-180. 


(22) See Whittaker [30, p. 45], Schoenberg [28].
(23) See Takenaka [15, 29]; Whittaker [30, p. 43].


10. I. M. Sheffer, Concerning some methods of best approximation, and a theorem of Birk- 
hoff, American Journal of Mathematics, vol. 57 (1935), pp. 587-614. 

11. S. Takahashi, On the expansion of analytic function, Proceedings of the Imperial Acad- 
emy, Tokyo, vol. 6 (1930), pp. 389-392. 

12. ———, A remark on Mr. D. V. Widder's theorem, Tôhoku Mathematical Journal,
vol. 33 (1930), pp. 48-54.

13. ———, On the expansion of analytic functions, Tôhoku Mathematical Journal, vol. 35
(1932), pp. 242-243.

14. S. Takenaka, A generalization of Taylor's series, Japanese Journal of Mathematics,
vol. 7 (1930), pp. 187-198.

15. ———, On the expansion of analytic functions in series of analytic functions and its 
application to the study of the distribution of zero points of the derivatives of analytic functions, 
Nippon Sûgaku-Buturigakkwai Kizi (Proceedings of the Physico-Mathematical Society of
Japan), (3), vol. 13 (1931), pp. 111-132. 

16. L. Tonelli, Sulle serie di funzioni analitiche della forma $\sum a_n(x) x^n$, Annali di Matematica
Pura ed Applicata, (3), vol. 18 (1911), pp. 99-103. 

17. J. L. Walsh, On the expansion of analytic functions in series of polynomials, these Trans- 
actions, vol. 26 (1924), pp. 155-170. 

18. ———, On the expansion of analytic functions in series of polynomials and in series of
other analytic functions, these Transactions, vol. 30 (1928), pp. 307-332.

19. ———, Note on the expansion of analytic functions in series of polynomials and in
series of other analytic functions, these Transactions, vol. 31 (1929), pp. 53-57.

20. D. V. Widder, On the expansion of analytic functions of a complex variable in generalized 
Taylor’s series, these Transactions, vol. 31 (1929), pp. 43-52. 


OTHER REFERENCES 


21. S. Banach, Théorie des Opérations Linéaires, 1932. 

22. R. P. Boas, Jr., Univalent derivatives of entire functions, Duke Mathematical Journal, 
vol. 6 (1940), pp. 719-721. 

23. H. F. Bohnenblust and A. Sobczyk, Extensions of functionals on complex linear spaces, 
Bulletin of the American Mathematical Society, vol. 44 (1938), pp. 91-93. 

24. P. Dienes, The Taylor Series, An Introduction to the Theory of Functions of a Complex 
Variable, 1931. 

25. A. Gelfond, Interpolation et unicité des fonctions entiéres, Matematicheskii Sbornik 
(Recueil Mathématique), new series, vol. 4 (1938), pp. 115-147. 

26. R. E. A. C. Paley and N. Wiener, Fourier Transforms in the Complex Domain, Ameri- 
can Mathematical Society Colloquium Publications, vol. 19, 1934. 

27. G. Pólya, Untersuchungen über Lücken und Singularitäten von Potenzreihen, Mathematische
Zeitschrift, vol. 29 (1929), pp. 549-640.

28. I. J. Schoenberg, On the zeros of successive derivatives of integral functions, these Trans- 
actions, vol. 40 (1936), pp. 12-23. 

29. S. Takenaka, On the expansion of integral transcendental functions in generalized 
Taylor's series, Nippon Sûgaku-Buturigakkwai Kizi (Proceedings of the Physico-Mathematical
Society of Japan), (3), vol. 14 (1932), pp. 529-542. 

30. J. M. Whittaker, Interpolatory Function Theory, 1935. 

31. A. Zygmund, Trigonometrical Series, 1935. 

32. I. I. Ibragimoff (I. Ibraguimoff), Sur quelques systémes complets de fonctions analytiques 
(in Russian), Izvestiya Akademii Nauk SSSR, Seriya Matematicheskaya (Bulletin de 
l’Académie des Sciences de l’URSS, Série Mathématique), 1939, pp. 553-567; French sum- 
mary, pp. 567-568. 


DUKE UNIVERSITY,
DURHAM, N. C.


ON THE INTEGRO-DIFFERENTIAL EQUATIONS OF 
PURELY DISCONTINUOUS MARKOFF PROCESSES 


BY 
WILLY FELLER 


1. Introduction. In the following we are concerned with stochastic processes
depending on a continuous time parameter t, that is to say, with some
entity (chance variable) whose state is specified by a point X(t) varying in
some space E according to some probability law. The process is called a
Markoff process(1) if the probability distribution of X(t) is completely determined
for all $t > \tau$ by the knowledge of the state $X(\tau)$, and in particular is
independent of the development of the process for $t < \tau$(2). Analytically a
Markoff process is completely determined by its transition probabilities
$P(\tau, x; t, A)$, giving the conditional probability of X(t)’s being contained,
at the moment t, in the set $A \subset E$ under the hypothesis that at a fixed moment
$\tau < t$ the state $X(\tau)$ coincided with the point x of E.

In strict terms, we shall suppose that there is specified, in the space E,
a Borel field $\mathfrak{B}$ of sets (on which probabilities are defined) such that $E \in \mathfrak{B}$ and
also any set consisting of a single point belongs to $\mathfrak{B}$. It is then required that
$P(\tau, x; t, A)$ is, for fixed $\tau$, $t > \tau$ and $x \in E$, a non-negative and completely additive
function of sets on $\mathfrak{B}$, with

(1) $P(\tau, x; t, E) = 1.$

Moreover, we shall always assume that for fixed $\tau$, t, A the function $P(\tau, x; t, A)$
is measurable with respect to $\mathfrak{B}$, that is to say, that for any $a > 0$ the set where
$P(\tau, x; t, A) < a$ belongs to $\mathfrak{B}$. Finally we shall, for the sake of simplicity, restrict
ourselves to $P(\tau, x; t, A)$ depending, for fixed other arguments, continuously
on both $\tau$ and t(3). This implies in particular that as either $t \to \tau + 0$ or
$\tau \to t - 0$


Presented to the Society, February 24, 1940; received by the editors March 5, 1940.

(1) This name was chosen in accordance with the now common terminology in the case of
processes with an integral-valued parameter t. Kolmogoroff [6] calls such processes stochastically
definite, and this terminology I had also adopted previously. Markoff processes are, sometimes,
also described as being “without after effect,” or as being submitted to an “influence
globale” (Pólya).

(2) This is, of course, not meant to be a strict definition; as a matter of fact, we shall be
concerned only with the function $P(\tau, x; t, A)$, which will be defined purely analytically.

(3) It may be pointed out that this does not imply the continuity of the movement of X(t).
We shall, on the contrary, be concerned only with states X(t) changing abruptly by jumps.
The continuity of $P(\tau, x; t, A)$ means that the probability of a jump during a small time-interval
is small.




(2) $P(\tau, x; t, A) \to \delta(x, A) = \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{otherwise.} \end{cases}$


Subdividing now the interval $(\tau, t)$ by a point $s$ and considering all possible
states $X(s)$, we readily get the identity

(3) $P(\tau, x; t, A) = \int_E P(\tau, x; s, dE_y)\, P(s, y; t, A)$

known as the equation of Chapman and Kolmogoroff(4). We shall take these
relations as the analytic definition of a Markoff process and consider any
$P(\tau, x; t, A)$ of the described sort as defining the transition probabilities of such a
process(5).
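As an illustration of the identity (3), the following minimal sketch checks it numerically in the simplest setting. It assumes (this is not taken from the paper) a space $E$ of three points and time-independent rates, in which case the transition probabilities over an interval of length $t$ are the entries of a matrix exponential. The names `RATES`, `JUMP`, `mat_mul` and `transition_matrix` are hypothetical and chosen only for this example.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][m] * b[m][k] for m in range(n)) for k in range(n)]
            for i in range(n)]


def transition_matrix(rates, jump, t, terms=80):
    """Return exp(t*G) with G[i][k] = rates[i] * (jump[i][k] - delta_ik).

    Assumption: finitely many states and time-independent rates, so the
    transition probabilities depend only on the elapsed time t.
    """
    n = len(rates)
    gen = [[rates[i] * (jump[i][k] - (1.0 if i == k else 0.0))
            for k in range(n)] for i in range(n)]
    result = [[1.0 if i == k else 0.0 for k in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        # term becomes (t*G)^m / m!
        term = [[t * v / m for v in row] for row in mat_mul(term, gen)]
        result = [[result[i][k] + term[i][k] for k in range(n)]
                  for i in range(n)]
    return result


# Hypothetical example data: jump rates and jump distribution (zero diagonal).
RATES = [1.0, 2.5, 0.5]
JUMP = [[0.0, 0.7, 0.3],
        [0.4, 0.0, 0.6],
        [0.5, 0.5, 0.0]]

P_first = transition_matrix(RATES, JUMP, 0.3)    # from tau to s
P_second = transition_matrix(RATES, JUMP, 0.5)   # from s to t
P_direct = transition_matrix(RATES, JUMP, 0.8)   # from tau to t
P_composed = mat_mul(P_first, P_second)          # sum over intermediate states
```

In this finite case the integral in (3) is a finite sum over the intermediate state, so the identity amounts to `P_direct` agreeing with `P_composed` up to rounding, and each row of `P_direct` summing to 1 as in (1).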

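In the special case of the space $E$ containing at most an enumerable number
of points, we shall denote these points by $x_i$ and write

(4) $P(\tau, x_i; t, x_k) = P_{ik}(\tau, t).$

By (1) we have $\sum_k P_{ik}(\tau, t) = 1$, while (2) and (3) are respectively equivalent to

(5) $P_{ik}(\tau, t) \to \delta(x_i, x_k) \qquad (t - \tau \to +0)$

and

(6) $P_{ik}(\tau, t) = \sum_j P_{ij}(\tau, s)\, P_{jk}(s, t) \qquad (\tau < s < t).$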


Now a purely discontinuous process may be described by the following
property: if, at the moment $t$, the actual state is given by the point $x$, then
there is a probability $1 - p(t, x)\Delta t + o(\Delta t)$ that no change of state will occur
during $(t, t + \Delta t)$; and if a change occurs, the probability of $X(t+\Delta t)$'s being
contained in the set $A$ is given, except for terms of $o(1)$, by a probability
distribution $\Pi(t, x, A)$(6). In strict terms we shall say that the Markoff process


(4) Cf. Kolmogoroff [6] where the foundations of the general theory of Markoff processes
have been laid.

(5) This is the natural point of view for the purposes of the present paper. From an
axiomatical point of view, however, any stochastic process corresponds to a measure in the
space of real functions defined on $E$. Even in the case of Markoff processes there are problems
(especially the problem of ruin, playing an important rôle in the theory of risk) which require
a deeper penetration in the theory of the functional space. For the treatment of stochastic
processes in terms of measure, the reader is referred to J. L. Doob's fundamental paper [1].

(6) Essentially this definition was given by Feller [3]; cf. also Dubrovski [2]. This kind
of processes was mentioned also by Kolmogoroff [6].

Examples of such processes are furnished by the theory of radioactive processes and the
theory of automatic telephone offices; by the transport of stones by rivers (treated by quite
different methods of Pólya [8]); by the mathematical theory of struggle for life (Feller [4]),
etc. Perhaps the most important application is furnished by the theory of risk.

There is no general definition for "purely continuous" processes in abstract $E$. In the
Euclidean space such processes were defined by Kolmogoroff [6] and somewhat more generally by



defined by $P(\tau, x; t, A)$ is purely discontinuous if for small $t - \tau > 0$

(7) $P(\tau, x; t, A) = \{1 - p(t, x)(t - \tau)\}\,\delta(x, A) + p(t, x)(t - \tau)\,\Pi(t, x, A) + o(t - \tau),$

where $\delta(x, A)$ was defined by (2) and the exact assumptions as to $p(t, x)$ and
$\Pi(t, x, A)$ will be specified in §2, (i)-(ii); in general, $o(t - \tau)$ will depend on $x$
and $A$.

The main problem with which we are confronted is to determine whether
or not to any two functions $p(t, x)$ and $\Pi(t, x, A)$, subjected only to the
conditions §2, (i)-(ii), there corresponds a Markoff process, whose transition
probabilities $P(\tau, x; t, A)$ satisfy (7); and if so, whether this process is uniquely
determined.

It will be shown (§2) that there is a subclass $\mathfrak{B}_1$ of sets $A \in \mathfrak{B}$ such that for
all $A \in \mathfrak{B}_1$, all $\tau$, and almost all $t$ the partial derivatives $\partial P(\tau, x; t, A)/\partial t$ and
$\partial P(\tau, x; t, A)/\partial \tau$ exist; for all those $A$, all $\tau$ and almost all $t$, the
integro-differential equations

(8) $\dfrac{\partial P(\tau, x; t, A)}{\partial t} = -\int_A p(t, y)\, P(\tau, x; t, dE_y) + \int_E p(t, y)\, \Pi(t, y, A)\, P(\tau, x; t, dE_y)$

and

(9) $\dfrac{\partial P(\tau, x; t, A)}{\partial \tau} = p(\tau, x)\Big\{ P(\tau, x; t, A) - \int_E P(\tau, y; t, A)\, \Pi(\tau, x, dE_y) \Big\}$

hold, implying the existence of the integrals for all $A \in \mathfrak{B}_1$ and almost all $t$. The
class $\mathfrak{B}_1$ contains, among others, sequences of sets $A_n \uparrow E$.

Thus the problem is reduced to the integration of (8)-(9). It will be shown
(§§3-5) that there is a function $P(\tau, x; t, A)$ which satisfies (8)-(9) for all
$A \in \mathfrak{B}_1$ and almost all $t$ and $\tau$; this $P(\tau, x; t, A)$ is uniquely determined by each
of the equations (8)-(9)(7) and has all properties described above, except perhaps (1);
one has always

(10) $0 \le P(\tau, x; t, A) \le 1,$

but there are cases with


Feller [3]. This type is illustrated by the diffusion processes: there is a probability equal to 1
that some change of $X(t)$ will occur during any time-interval, but the chance is near to 1 that
the variation will be, in a specified sense, small for small intervals. This type is described by
partial differential equations of parabolic type. There is also a "mixed type" leading to the
equation (15) and its adjoint.

(7) It should be understood that, in general, a solution of (8) is not uniquely determined by
the initial values (2), not even in the case of enumerable spaces. The uniqueness mentioned is
a consequence of the additional hypothesis that $0 \le P(\tau, x; t, A) \le 1$.




(11) $P(\tau, x; t, E) < 1.$

This exceptional case arises only if $p(t, x)$ is unbounded (cf. Theorem 6), but
can occur also in the case of enumerable spaces $E$.

The existence of positive bounded solutions that conform with all other
requirements of the theory, including (3), but still fail to be distribution
functions, is most striking, and an analysis of this phenomenon was the primary
object, and constitutes the most delicate part, of the present investigation.
In the case of temporally homogeneous processes, that is, in the case of $p(t, x)$
and $\Pi(t, x, A)$ not depending on $t$, we give in §6 a necessary and sufficient
condition that the solution $P(\tau, x; t, A)$ be a proper probability distribution, that is
to say, that (1) holds. This condition is rather complicated, but can be
interpreted in terms of the ergodic properties of the system; and it shows in
particular that the exceptional case (11) can arise only in highly dissipative
systems. The simplest example for the phenomenon will be given in §7.
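The exceptional case (11) can be made tangible with a small numerical sketch. The chain used here is a hypothetical illustration and not necessarily the example of §7: an enumerable space of points $x_0, x_1, \dots$, a jump from $x_k$ always leading to $x_{k+1}$, and rates $p_k = (k+1)^2$, so that $\sum_k 1/p_k$ converges. The probabilities of the first $N$ states, starting from $x_0$, satisfy a closed triangular system of ordinary differential equations, which is integrated below by the classical Runge-Kutta method. Their partial sums increase with $N$ towards $P(\tau, x_0; t, E)$. The function name `pure_birth_distribution` and all numerical parameters are assumptions made for this example only.

```python
def pure_birth_distribution(rate, n_states, t, steps):
    """Integrate dP_k/dt = -p_k P_k + p_{k-1} P_{k-1}, P_0(0) = 1, by RK4.

    Returns [P_0(t), ..., P_{n_states-1}(t)] for a chain that starts in
    state 0 and can only move from state k to state k + 1.
    """
    p = [rate(k) for k in range(n_states)]

    def deriv(v):
        return [-p[k] * v[k] + (p[k - 1] * v[k - 1] if k > 0 else 0.0)
                for k in range(n_states)]

    v = [1.0] + [0.0] * (n_states - 1)
    h = t / steps
    idx = range(n_states)
    for _ in range(steps):
        k1 = deriv(v)
        k2 = deriv([v[i] + 0.5 * h * k1[i] for i in idx])
        k3 = deriv([v[i] + 0.5 * h * k2[i] for i in idx])
        k4 = deriv([v[i] + h * k3[i] for i in idx])
        v = [v[i] + h * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i]) / 6.0
             for i in idx]
    return v


T_END = 3.0
# Unbounded rates (k+1)^2: probability mass escapes "to infinity".
quadratic = pure_birth_distribution(lambda k: float((k + 1) ** 2), 30, T_END, 4000)
# Bounded rates p_k = 1 for comparison.
constant = pure_birth_distribution(lambda k: 1.0, 30, T_END, 4000)

mass_quadratic_15 = sum(quadratic[:15])   # mass on the first 15 states
mass_quadratic_30 = sum(quadratic)        # mass on the first 30 states
mass_constant_30 = sum(constant)
```

With bounded rates the first 30 states already carry practically the whole mass, whereas with the quadratic rates the partial sums barely change between 15 and 30 states and stay far below 1, which is the behaviour described by (11).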

In the case of an enumerable space $E$ we write corresponding to (4)

(12) $p(t, x_i) = p_i(t), \qquad \Pi(t, x_i, x_k) = \Pi_{ik}(t).$

Equations (8) and (9) are then equivalent with

(13) $\dfrac{\partial P_{ik}(\tau, t)}{\partial t} = -p_k(t)\, P_{ik}(\tau, t) + \sum_j p_j(t)\, \Pi_{jk}(t)\, P_{ij}(\tau, t)$

and

(14) $\dfrac{\partial P_{ik}(\tau, t)}{\partial \tau} = p_i(\tau)\Big\{ P_{ik}(\tau, t) - \sum_j \Pi_{ij}(\tau)\, P_{jk}(\tau, t) \Big\}.$

In this case the condition (7) is obviously only a regularity restriction, and
there exists only the type of purely discontinuous processes. It follows from
the results of the present paper that (7) implies the existence of $\partial P_{ik}(\tau, t)/\partial t$ for
almost all $t$, and hence also the convergence of the sum in (13) for almost all $t$.
However, this sum may diverge for special values of $t$. It is easy to impose on
$p_i(t)$ and $\Pi_{ik}(t)$ further restrictions ensuring the convergence of the sum in (13)
for all $t$ (cf. §2, (23)).

Equations (13)-(14) were derived by Kolmogoroff [6] under some slight
additional hypothesis on the passage to the limit in (7). The case of finitely
many $x_i$ was dealt with by several different methods: a full account of them
is to be found in Fréchet's treatise [5]. In the case of infinitely many $x_i$, the
first attempt was made by Kolmogoroff, who found a sufficient condition for
the existence of a solution of (13) with the initial condition (5)(8). From the
results of the present paper it readily follows, however, that Kolmogoroff's
solution is not necessarily a probability distribution, since it is possible that

(*) Kolmogoroff [6], §10. The usual notation is: —p¢(t) = Ass(t), = With 


this notation Kolmogoroff’s condition requires that, putting BY” =1 and Aj|, 
all By” exist and that >_.B,’x"/n! converges for some x >0 and all k. 




$\sum_k P_{ik}(\tau, t) < 1$. On the other hand, Kolmogoroff's assumptions are rather
restrictive.

The case of $E$'s being the real axis or any Borel set on it was dealt with
by Feller [3]. Equations (8)-(9) were, however, derived from (7) under
additional hypotheses, and integrated only in the case of a bounded $p(t, x)$ (in
which case (11) cannot occur). It may be pointed out that this covers also
the special case of enumerable $E$ in the case of bounded coefficients in
(13)-(14)(9). As Dubrovski [2] has shown, Feller's results and proofs can be
transferred almost literally to the case of an arbitrary abstract space $E$(10).

The present method of dealing with equations (8)-(9) is more general than
that used loc. cit. [2, 3], but affords at the same time a considerable
simplification. The same simplification can be made in the treatment of the more
general integro-differential equation of parabolic type:

(15) $\dfrac{\partial P(\tau, x; t, A)}{\partial \tau} + a(\tau, x) \dfrac{\partial^2 P(\tau, x; t, A)}{\partial x^2} + b(\tau, x) \dfrac{\partial P(\tau, x; t, A)}{\partial x} = p(\tau, x)\Big\{ P(\tau, x; t, A) - \int_E P(\tau, y; t, A)\, \Pi(\tau, x, dE_y) \Big\},$

where $E$ is the real axis, $a(\tau, x) > 0$. This equation and its adjoint describe the
mixed type of a Markoff process(11).

2. Preliminaries. The following assumptions on $p(t, x)$ and $\Pi(t, x, A)$ will
be made throughout the paper:

(i) $p(t, x)$ is finite and non-negative for all points $x$ of $E$ and all $t$ of some
finite or infinite interval $T_0 < t < T_1$. For $x$ fixed, $p(t, x)$ is a continuous
function of $t$, and for $t$ fixed it is measurable with respect to $\mathfrak{B}$.

(ii) $\Pi(t, x, A)$ is defined for $T_0 < t < T_1$, for all $x \in E$ and all sets $A \in \mathfrak{B}$. For
fixed $x$, $A$ it is a continuous function of $t$; for fixed $t$, $A$ it is measurable with
respect to $\mathfrak{B}$, and for fixed $t$, $x$ it is a non-negative completely additive
function of sets $A \in \mathfrak{B}$ with

(16) $\Pi(t, x, E) = 1.$

Finally, for the set $A = x$ we suppose that

(17) $\Pi(t, x, x) = 0.$

Throughout this paper the parameters $t$ and $\tau$ are restricted so that

$T_0 < \tau < t < T_1,$

where $(T_0, T_1)$ is the interval specified above. $x$, $y$, $z$ will denote points of $E$.
Any function of points will be supposed, or is easily seen to be, measurable
with respect to $\mathfrak{B}$. A set $A \in \mathfrak{B}$ will be called bounded if $p(t, x)$ is uniformly
bounded for all $t$ and $x \in A$. In particular, we shall write

(18) $A_a = E\{ p(t, x) < a \},$

where $a > 0$. By (i) obviously $A_a \in \mathfrak{B}$ and $A_a \uparrow E$ as $a \uparrow \infty$. Any finite set is
bounded, and in the case of an enumerable $E$ it is more convenient to consider
finite sets instead of bounded. A similar remark applies if $E$ is equipped with
a metric.

(9) It suffices namely to interpret the points $x_i$ as integers. It was with a view of this case
that $p(t, x)$ was, in [3], supposed only to be measurable with respect to $x$. The point was not,
however, mentioned explicitly and seems to have been generally overlooked.

(10) Added in proof: In a recent paper [9] (which became accessible to the author only
after the present paper was submitted for publication), W. Doeblin investigated essentially the
same class of stochastic processes with which we are concerned here. It may be remarked that
Doeblin's methods as well as his results are different from ours. He proceeds by a direct and
careful analysis of the stochastic movement itself, and arrives at a characterization of the
process by means of two functions $U(\tau, t, x)$ and $V(\tau, x; t, A)$ which may, roughly, be
described, respectively, as the probability that the moving point $X(t)$ will remain in its initial
position $x$ during $(\tau, t)$, and the compound probability that it will undergo a change such that
the first jump takes it into the set $A$. These functions must satisfy the functional equations
$U(\tau, t, x) = U(\tau, s, x)\,U(s, t, x)$ and $V(\tau, x; t, A) = V(\tau, x; s, A) + U(\tau, s, x)\,V(s, x; t, A)$
for $\tau < s < t$. It is shown that except for these equations and some trivial additional
restrictions the functions $U$ and $V$ can be prescribed arbitrarily. The occurrence of the
exceptional case (11) is ruled out by a uniformity condition.

(11) For the definition see §2 and for the integration of (15), §5 of Feller [3].
By (i) and (ii) integrals of the type $J(t, x, A) = \int_A p(t, y)\, \Pi(t, x, dE_y)$ have
a meaning; if, in particular, $A$ is a bounded set, $J(t, x, A)$ is for fixed $x$ a
continuous function of $t$, and for fixed $t$ a function of $x$ which is measurable
with respect to $\mathfrak{B}$. Now any set $A \in \mathfrak{B}$ is the limit of an increasing sequence
of bounded sets, and hence any function of the type $J(t, x, A)$ is the limit of a
monotonic sequence of functions which are, for fixed other arguments,
continuous with respect to $t$ and measurable with respect to $\mathfrak{B}$. This remark
applies to all integrals which will be used in the sequel, and enables us in
particular to use repeated integrals. We shall also frequently have to interchange
the order of integration. To legitimate this procedure once for all the
following may be remarked.

Only two different types of inversions will be used. Sometimes both in- 
tegrations will be with respect to time-parameters: in such cases the elemen- 
tary theory of repeated integrals will suffice to justify the change in the order 
of integration. In all other cases the inversion will be based on the following 


LEMMA(12). Let $E^{(1)}$ and $E^{(2)}$ be two spaces, and let $\mathfrak{B}^{(i)}$ be a Borel field of
subsets of $E^{(i)}$, $i = 1, 2$. Denote by $x^{(i)}$ a point varying in $E^{(i)}$, and by $A^{(i)}$ a set
belonging to $\mathfrak{B}^{(i)}$. Let $f(x^{(1)})$ and $g(x^{(2)})$ be two non-negative and bounded functions,
measurable with respect to $\mathfrak{B}^{(1)}$ and to $\mathfrak{B}^{(2)}$, respectively. Let $F(A^{(1)})$ be a
completely additive function of sets with $0 \le F(A^{(1)}) \le 1$. Finally, let
$G(x^{(1)}, A^{(2)})$ be defined for all $x^{(1)} \in E^{(1)}$ and $A^{(2)} \in \mathfrak{B}^{(2)}$ so that it is, for fixed $x^{(1)}$,

(12) Added in proof: A similar theorem for the case of the real axis was announced by R. H.
Cameron and W. T. Martin, but is not yet published; see the abstract presented to the American
Mathematical Society, 46-3-162.




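a completely additive function of sets $A^{(2)}$, and for fixed $A^{(2)}$ measurable with
respect to $\mathfrak{B}^{(1)}$, and so that for all values of the arguments $0 \le G(x^{(1)}, A^{(2)}) \le 1$. Then
for any two fixed sets $\Gamma^{(1)} \in \mathfrak{B}^{(1)}$ and $\Gamma^{(2)} \in \mathfrak{B}^{(2)}$

(19) $\int_{\Gamma^{(1)}} f(x^{(1)})\, F(dE_{x^{(1)}}) \int_{\Gamma^{(2)}} g(x^{(2)})\, G(x^{(1)}, dE_{x^{(2)}}) = \int_{\Gamma^{(2)}} g(x^{(2)}) \int_{\Gamma^{(1)}} f(x^{(1)})\, G(x^{(1)}, dE_{x^{(2)}})\, F(dE_{x^{(1)}}).$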


Before proving this theorem let us remark that it is much simpler than
Fubini's theorem, but is not contained in it. In our applications either both
spaces $E^{(i)}$ will coincide with $E$, or else $E^{(1)}$ will be the real time-axis and $E^{(2)}$
the space $E$. It is clear that

$\int_{\Gamma^{(2)}} g(x^{(2)})\, G(x^{(1)}, dE_{x^{(2)}})$

is, as a function of $x^{(1)}$, measurable with respect to $\mathfrak{B}^{(1)}$, since it is the limit of
measurable functions. Similarly

$\int_{\Gamma^{(1)}} f(x^{(1)})\, G(x^{(1)}, A^{(2)})\, F(dE_{x^{(1)}})$

is a completely additive function of sets $A^{(2)} \in \mathfrak{B}^{(2)}$. Thus both sides in (19)
have a meaning.

The lemma is easily proved by a decomposition $\Gamma^{(2)} = \sum_n \Gamma_n^{(2)}$, where $\Gamma_n^{(2)}$
is the set of all points $x^{(2)}$ of $\Gamma^{(2)}$ where $(n-1)\epsilon \le g(x^{(2)}) < n\epsilon$. Since $g(x^{(2)})$ is bounded,
only finitely many $\Gamma_n^{(2)}$ are not empty. Hence


rt) 


and the last integral is bounded. This proves (19) with the sign = instead of 
the equality. In the same way, however, we get also the opposite limitation, 
and this accomplishes the proof. 

A word has still to be said about the derivation of the equations (8) and
(9) and the relations between them, though this is by no means necessary for
the understanding of the following existence theorems. Accordingly, the
reader can pass over directly to §3.

Equation (8) is more natural than (9) since, roughly speaking, (9)
describes the process in its dependence on the initial values. (8) leads also to a
representation of $P(\tau, x; t, A)$ which is in most cases more useful than the
representation deduced from (9). The latter equation is, nevertheless, simpler
than (8) since the integrals in (9) converge for all sets $A \in \mathfrak{B}$ and a derivative
$\partial P/\partial \tau$ exists for all $\tau$, whereas the integrals in (8) will, in general, converge
only for bounded sets and almost all $t$, so that also $\partial P/\partial t$ exists only for
bounded sets and almost all $t$.

In previous papers(13) the equations (8) and (9) were derived under the
assumption that the passage to the limit in (7) takes place uniformly with
respect to $x$. For a general theory, however, such an assumption is not only
an unnecessary restriction, but is also dangerous since it can be shown by
examples that it is not realized for the actual solutions(14).

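To deduce (9) we observe that by (3) we have for $\Delta\tau > 0$

$P(\tau - \Delta\tau, x; t, A) = \int_E P(\tau - \Delta\tau, x; \tau, dE_y)\, P(\tau, y; t, A)$

or, splitting the space of integration into $x$ and $E - x$,

(20) $\dfrac{1}{\Delta\tau}\{ P(\tau - \Delta\tau, x; t, A) - P(\tau, x; t, A) \} = \dfrac{P(\tau - \Delta\tau, x; \tau, x) - 1}{\Delta\tau}\, P(\tau, x; t, A) + \dfrac{1}{\Delta\tau} \int_{E - x} P(\tau, y; t, A)\, P(\tau - \Delta\tau, x; \tau, dE_y).$

Now by (7) and (17)

$\dfrac{1}{\Delta\tau}\{ P(\tau - \Delta\tau, x; \tau, x) - 1 \} \to -p(\tau, x),$

and using (16) and (17) it is seen that also

$\dfrac{1}{\Delta\tau} P(\tau - \Delta\tau, x; \tau, E - x) \to p(\tau, x).$

Hence, for fixed $\tau$, $x$, the ratio $P(\tau - \Delta\tau, x; \tau, A)/\Delta\tau$ is uniformly bounded for
all sets $A$ not containing $x$, and by (7) this quantity tends to $p(\tau, x)\,\Pi(\tau, x, A)$.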
(13) Feller [3], Dubrovski [2].

(14) It may be remarked that the occurrence of solutions satisfying (11) has nothing
whatsoever to do with the nonuniformity of the passage to the limit in (7) (or with the
circumstance that the derivatives of $P(\tau, x; t, A)$ are not bounded).


The right-hand member of (20) is thus seen to tend to the limit given by (9),
and it follows from (20) that a left-hand derivative $\partial P(\tau, x; t, A)/\partial \tau$ exists for
all $\tau$, $x$, $t$, $A$ and that with this derivative (9) holds. The actual (and unique)
solution of (9) will show that this left-hand derivative actually is the derivative
in the usual sense.

It seems impossible to give a strict proof also for (8) in an equally simple
way. One can easily render (8) plausible by writing, according to (3),

(21) $\dfrac{1}{\Delta t}\{ P(\tau, x; t + \Delta t, A) - P(\tau, x; t, A) \} = \int_E P(\tau, x; t, dE_y)\, \{ P(t, y; t + \Delta t, A) - \delta(y, A) \}/\Delta t,$

and going formally to the limit applying (7). In a strict sense, however, one
gets by this procedure only a partial result. Denote namely by $\mathfrak{B}'$ the class
of sets such that $A \in \mathfrak{B}'$ if, and only if, $A$ is bounded and there is a constant
$a > 0$ such that

(22) $\dfrac{1 - P(t, x; t + \Delta t, x)}{\Delta t} < a, \qquad \dfrac{P(t, x; t + \Delta t, E - x)}{\Delta t} < a$

for all $t$, all $x \in A$, and all $\Delta t > 0$. Denoting then by $D_t$ the upper right-hand
derivative with respect to $t$, it follows easily from (21) and (22) that for all
sets $A \in \mathfrak{B}'$

$D_t P(\tau, x; t, A) \ge -\int_A p(t, y)\, P(\tau, x; t, dE_y) + \int_E p(t, y)\, \Pi(t, y, A)\, P(\tau, x; t, dE_y).$

Now here the first integral converges, since $A$ is a bounded set. Thus for fixed
$A \in \mathfrak{B}'$ and for fixed $x$, $D_t P(\tau, x; t, A)$ is uniformly bounded from below; and
$D_t P(\tau, x; t, A) = \infty$ for all values of $t$ for which the second integral diverges.
Since $0 \le P(\tau, x; t, A) \le 1$ it follows that, for $A \in \mathfrak{B}'$, the second integral must
converge for almost all $t$, that is to say, that a finite right-hand derivative
$\partial P(\tau, x; t, A)/\partial t$ exists for all $\tau$, $A \in \mathfrak{B}'$, and almost all $t$, and furthermore that
with this derivative

$\dfrac{\partial P(\tau, x; t, A)}{\partial t} = -\int_A p(t, y)\, P(\tau, x; t, dE_y) + \int_E p(t, y)\, \Pi(t, y, A)\, P(\tau, x; t, dE_y).$

Actually the sign of equality in (8) holds not only for all $A \in \mathfrak{B}'$ but even for
all bounded sets and almost all $t$. For the sake of simplicity we prefer,
however, to prove this assertion in an indirect way:

We shall namely prove that there is (under the assumptions (i)-(ii) on
$p(t, x)$ and $\Pi(t, x, A)$) one and only one function $P(\tau, x; t, A)$ satisfying (9)
with the initial condition (2) and which is, for fixed $\tau$, $x$, $t$ a completely additive
function of sets $A \in \mathfrak{B}$ with $0 \le P(\tau, x; t, A) \le 1$. For bounded sets $A$ this function
will be shown to be an absolutely continuous function of $t$, satisfying (8) for
almost all $t$.

Moreover, it will be shown that with this solution (3) also holds. This gives
a uniqueness theorem for our general problem, but an existence theorem will
be given essentially only for uniformly bounded $p(t, x)$ (cf. Theorem 6), since
sometimes instead of (1) only (11) holds.

This result shows in particular that we may use, instead of the class $\mathfrak{B}'$
considered above, the class $\mathfrak{B}_1$ of all bounded sets. This by itself does not
imply that (22) holds for any bounded set and some suitable $a$. It may,
however, be remarked that this is actually the case, as is readily seen from the
representation of the solution given below.

It may still be pointed out that it can be shown by examples that not even
for bounded sets $A$ does the derivative $\partial P(\tau, x; t, A)/\partial t$ need to exist for all $t$.
It is, however, easy to make additional assumptions on $p(t, x)$ which assure
the existence of $\partial P(\tau, x; t, A)/\partial t$ for all bounded sets and all $t$. Such a
hypothesis is, for instance, that

(23) $\dfrac{p(t_1, x)}{1 + p(t_2, x)} < M$

uniformly for all values $x$, $t_1$, $t_2$. This hypothesis is in particular fulfilled in the
case of temporally homogeneous processes.

3. Solution of (8). We shall define a new completely additive function of
sets $A \in \mathfrak{B}$ by

(24) $\Pi^*(\tau, x; t, A) = \int_A \exp\Big\{ -\int_\tau^t p(s, y)\, ds \Big\}\, \Pi(\tau, x, dE_y).$

Obviously $0 \le \Pi^*(\tau, x; t, A) \le 1$; furthermore for any bounded set $A$

(25) $\dfrac{\partial \Pi^*(\tau, x; t, A)}{\partial t} = -\int_A p(t, y)\, \Pi^*(\tau, x; t, dE_y).$


THEOREM 1. Put(15)

(26) $P^{(0)}(\tau, x; t, A) = \delta(x, A) \exp\Big\{ -\int_\tau^t p(s, x)\, ds \Big\}$

and for $n \ge 1$

(27) $P^{(n)}(\tau, x; t, A) = \int_\tau^t d\sigma \int_E p(\sigma, y)\, \Pi^*(\sigma, y; t, A)\, P^{(n-1)}(\tau, x; \sigma, dE_y).$

Let

(28) $P(\tau, x; t, A) = \sum_{n=0}^{\infty} P^{(n)}(\tau, x; t, A);$

(i) the function $P(\tau, x; t, A)$ is for fixed $\tau$, $t$, $x \in E$ a completely additive function
of sets $A \in \mathfrak{B}$ with $0 \le P(\tau, x; t, A) \le 1$; (ii) $P(\tau, x; t, A)$ is for fixed $\tau$, $x$, $A$ an
absolutely continuous function of $t$; for any bounded $A$ and almost all $t$ the derivative
$\partial P/\partial t$ is finite and satisfies (8) with the initial condition (2).

(15) $\delta(x, A)$ was defined by (2).


Remark. It will be seen that for any bounded $A$ and almost all $t$

(29) $\dfrac{\partial P^{(0)}(\tau, x; t, A)}{\partial t} = -p(t, x)\, P^{(0)}(\tau, x; t, A),$
$\qquad \dfrac{\partial P^{(n)}(\tau, x; t, A)}{\partial t} = -\int_A p(t, y)\, P^{(n)}(\tau, x; t, dE_y) + \int_E p(t, y)\, \Pi(t, y, A)\, P^{(n-1)}(\tau, x; t, dE_y);$

the integrals on the right side converging for almost all $t$. These equations are
in close analogy with (8), and afford the interpretation of $P^{(n)}(\tau, x; t, A)$ as
the compound probability that during $(\tau, t)$ the state $X$ will change by exactly
$n$ jumps and that $X(t) \in A$, if it is known that $X(\tau) = x$.
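The interpretation of $P^{(n)}$ as an "exactly $n$ jumps" probability can be checked by hand in the simplest case. The following worked example is a sketch under assumptions that are not part of the paper: $E$ is the set of integers $x_k = k$, the rate is a constant $p(t, x) = \lambda$, and every jump leads from $k$ to $k + 1$, so that $\Pi^*(\sigma, k; t, \{k+1\}) = e^{-\lambda(t - \sigma)}$ by (24).

```latex
% Assumed example: constant rate \lambda, jumps k -> k+1 only (Poisson process).
\begin{align*}
P^{(0)}(\tau, k; t, \{k\}) &= e^{-\lambda (t-\tau)}
  && \text{by (26)},\\
P^{(1)}(\tau, k; t, \{k+1\})
  &= \int_\tau^t \lambda\, e^{-\lambda (t-\sigma)}\, e^{-\lambda(\sigma-\tau)}\, d\sigma
   = \lambda (t-\tau)\, e^{-\lambda (t-\tau)}
  && \text{by (27)},\\
P^{(n)}(\tau, k; t, \{k+n\})
  &= \int_\tau^t \lambda\, e^{-\lambda (t-\sigma)}\,
     \frac{(\lambda(\sigma-\tau))^{n-1}}{(n-1)!}\, e^{-\lambda(\sigma-\tau)}\, d\sigma
   = \frac{(\lambda (t-\tau))^{n}}{n!}\, e^{-\lambda (t-\tau)} .
\end{align*}
```

Summing over $n$ as in (28) gives $P(\tau, k; t, E) = 1$, so in this bounded-rate example the solution is a proper probability distribution, namely the Poisson law for the number of jumps during $(\tau, t)$.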

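In the special case of an enumerable $E$ it is, of course, sufficient to determine
the quantities $P^{(n)}_{ik}(\tau, t)$. For these, (29) reduces to the ordinary
differential equations

(30) $\dfrac{\partial P^{(n)}_{ik}(\tau, t)}{\partial t} = -p_k(t)\, P^{(n)}_{ik}(\tau, t) + \sum_j p_j(t)\, \Pi_{jk}(t)\, P^{(n-1)}_{ij}(\tau, t)$

and (26)-(27) to

$P^{(0)}_{ik}(\tau, t) = \delta(x_i, x_k) \exp\Big\{ -\int_\tau^t p_i(s)\, ds \Big\},$
$P^{(n)}_{ik}(\tau, t) = \sum_j \int_\tau^t p_j(\sigma)\, \Pi_{jk}(\sigma) \exp\Big\{ -\int_\sigma^t p_k(s)\, ds \Big\}\, P^{(n-1)}_{ij}(\tau, \sigma)\, d\sigma.$

If the $p_i(\sigma)$ are not subjected to a further restriction analogous to (23), the
derivative $\partial P^{(n)}_{ik}(\tau, t)/\partial t$ will exist only for almost all $t$.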


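Proof. Suppose, by induction, that $P^{(n)}(\tau, x; t, A)$ exists for some fixed
$n \ge 0$, and all values of the arguments, and that it is a completely additive
function of sets $A \in \mathfrak{B}$ with $0 \le P^{(n)}(\tau, x; t, A) \le 1$; furthermore that

(31) $L^{(n)}(\tau, x, t) = \int_\tau^t d\sigma \int_E p(\sigma, y)\, P^{(n)}(\tau, x; \sigma, dE_y)$

is finite. This is certainly true for $n = 0$ and

(32) $P^{(0)}(\tau, x; t, E) + L^{(0)}(\tau, x, t) = 1.$

It follows then from (27) that also $P^{(n+1)}(\tau, x; t, A)$ exists and
$0 \le P^{(n+1)}(\tau, x; t, A) \le L^{(n)}(\tau, x, t)$. For any bounded set $A$, therefore, we get from
(25) and (27)

$\int_A p(t, y)\, P^{(n+1)}(\tau, x; t, dE_y) = -\int_\tau^t d\sigma \int_E p(\sigma, y)\, \dfrac{\partial \Pi^*(\sigma, y; t, A)}{\partial t}\, P^{(n)}(\tau, x; \sigma, dE_y);$

the left-hand member is obviously a continuous function of $t$, and we get

$\int_\tau^t d\sigma_1 \int_A p(\sigma_1, y)\, P^{(n+1)}(\tau, x; \sigma_1, dE_y) = -\int_\tau^t d\sigma_1 \int_\tau^{\sigma_1} d\sigma \int_E p(\sigma, y)\, \dfrac{\partial \Pi^*(\sigma, y; \sigma_1, A)}{\partial \sigma_1}\, P^{(n)}(\tau, x; \sigma, dE_y);$

inverting the order of integration and observing that $\Pi^*(\sigma, y; \sigma, A) = \Pi(\sigma, y, A)$, we get

$\int_\tau^t d\sigma_1 \int_A p(\sigma_1, y)\, P^{(n+1)}(\tau, x; \sigma_1, dE_y) = -\int_\tau^t d\sigma \int_E p(\sigma, y)\, \{ \Pi^*(\sigma, y; t, A) - \Pi(\sigma, y, A) \}\, P^{(n)}(\tau, x; \sigma, dE_y)$

or by (27) finally

(33) $P^{(n+1)}(\tau, x; t, A) + \int_\tau^t d\sigma \int_A p(\sigma, y)\, P^{(n+1)}(\tau, x; \sigma, dE_y) = \int_\tau^t d\sigma \int_E p(\sigma, y)\, \Pi(\sigma, y, A)\, P^{(n)}(\tau, x; \sigma, dE_y).$

This is, essentially, the relation (29) of the remark following Theorem 1.
Equation (33) holds for any bounded set $A$. We apply (33) in particular to
$A = A_a$ (see (18)) and let $a \uparrow \infty$. Since $\Pi(\sigma, y, A_a) \le 1$ the right-hand member
is bounded by $L^{(n)}(\tau, x, t)$ (cf. (31)). Hence we get

(34) $P^{(n+1)}(\tau, x; t, E) + L^{(n+1)}(\tau, x, t) = L^{(n)}(\tau, x, t).$

It is thus seen that both $P^{(n+1)}(\tau, x; t, E)$ and $L^{(n+1)}(\tau, x, t)$ exist. Moreover,
since $P^{(n+1)}(\tau, x; t, E) \ge 0$, we have $L^{(n+1)}(\tau, x, t) \le L^{(n)}(\tau, x, t)$. Thus, for all
$n \ge 0$,

(35) $1 \ge L^{(0)}(\tau, x, t) \ge L^{(n)}(\tau, x, t) \ge L^{(n+1)}(\tau, x, t) \ge 0.$

By (34) and (35) also $P^{(n+1)}(\tau, x; t, A) \le 1$, and thus the assumptions of the
inductive argument hold for all $n$.

It may be remarked that it can be shown by examples that the integrand
of (31), $\int_E p(\sigma, y)\, P^{(n)}(\tau, x; \sigma, dE_y)$, sometimes diverges for some values of $\sigma$.
The proof shows, however, that it converges for almost all $\sigma$, and it is readily
seen that it converges for all $\sigma$ if (23) holds.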

Now we get from (34) and (32)

(36) $\sum_{n=0}^{N} P^{(n)}(\tau, x; t, E) = 1 - L^{(N)}(\tau, x, t)$

and thus $0 \le P(\tau, x; t, A) \le 1$.
Hence we readily deduce from (33) for any bounded set $A \in \mathfrak{B}$

(37) $P(\tau, x; t, A) + \int_\tau^t d\sigma \int_A p(\sigma, y)\, P(\tau, x; \sigma, dE_y) = \delta(x, A) + \int_\tau^t d\sigma \int_E p(\sigma, y)\, \Pi(\sigma, y, A)\, P(\tau, x; \sigma, dE_y);$

this proves (8) for almost all $t$. If (23) holds, (37) can obviously be
differentiated for all $t$ and (8) holds for any bounded set and all $t$.
From (35) and (36) we get also the following

COROLLARY. The necessary and sufficient condition that $P(\tau, x; t, E) = 1$ for
all $t$ is that

(38) $L(\tau, x, t) = \lim_{N \to \infty} L^{(N)}(\tau, x, t) = 0$

for all $t$.

Incidentally, it is quite obvious that (7) holds at least for all bounded
sets $A$, since by (26) and (27)

$\lim_{h \to +0} \dfrac{1}{h}\{ \delta(x, A) - P^{(0)}(\tau, x; \tau + h, A) \} = p(\tau, x)\, \delta(x, A),$

$\lim_{h \to +0} \dfrac{1}{h} P^{(1)}(\tau, x; \tau + h, A) = p(\tau, x)\, \Pi(\tau, x, A)$

and by (10) and (16)

$\limsup_{h \to +0} \dfrac{1}{h} \sum_{n \ge 2} P^{(n)}(\tau, x; \tau + h, E) \le \lim_{h \to +0} \dfrac{1}{h}\{ 1 - P^{(0)}(\tau, x; \tau + h, E) - P^{(1)}(\tau, x; \tau + h, E) \} = 0.$

That (7) holds for any $A \in \mathfrak{B}$ will be proved in §5.
4. Solution of (9). We now prove the following theorem. 


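THEOREM 2. Put

(39) $Q^{(0)}(\tau, x; t, A) = \delta(x, A) \exp\Big\{ -\int_\tau^t p(s, x)\, ds \Big\}$

and for $n \ge 1$

(40) $Q^{(n)}(\tau, x; t, A) = \int_\tau^t p(\sigma, x) \exp\Big\{ -\int_\tau^\sigma p(s, x)\, ds \Big\}\, d\sigma \int_E Q^{(n-1)}(\sigma, y; t, A)\, \Pi(\sigma, x, dE_y).$

Then(16), for any fixed $\tau$, $x$, $t$,

(41) $P(\tau, x; t, A) = \sum_{n=0}^{\infty} Q^{(n)}(\tau, x; t, A)$

is a completely additive function of sets $A \in \mathfrak{B}$ and $0 \le P(\tau, x; t, A) \le 1$.
Furthermore $P(\tau, x; t, A)$ is a solution of (9) with the initial values given by (2).

Remark. Obviously the $Q^{(n)}(\tau, x; t, A)$ are solutions of the equations

(42) $\dfrac{\partial Q^{(0)}(\tau, x; t, A)}{\partial \tau} = p(\tau, x)\, Q^{(0)}(\tau, x; t, A),$
$\qquad \dfrac{\partial Q^{(n)}(\tau, x; t, A)}{\partial \tau} = p(\tau, x)\Big\{ Q^{(n)}(\tau, x; t, A) - \int_E Q^{(n-1)}(\tau, y; t, A)\, \Pi(\tau, x, dE_y) \Big\},$

which can be treated as ordinary differential equations.

Proof. Put

$S^{(n)}(\tau, x; t, A) = \sum_{k=0}^{n} Q^{(k)}(\tau, x; t, A).$

Then $0 \le S^{(0)}(\tau, x; t, A) \le 1$. Let us suppose that $0 \le S^{(0)} \le S^{(1)} \le \dots \le S^{(n-1)} \le 1$.
Then by (39) and (40)

(43) $S^{(n)}(\tau, x; t, A) = \exp\Big\{ -\int_\tau^t p(s, x)\, ds \Big\}\, \delta(x, A) + \int_\tau^t p(\sigma, x) \exp\Big\{ -\int_\tau^\sigma p(s, x)\, ds \Big\}\, d\sigma \int_E S^{(n-1)}(\sigma, y; t, A)\, \Pi(\sigma, x, dE_y)$
$\qquad \le \exp\Big\{ -\int_\tau^t p(s, x)\, ds \Big\} \Big\{ 1 + \int_\tau^t p(\sigma, x) \exp\Big\{ \int_\sigma^t p(s, x)\, ds \Big\}\, d\sigma \Big\} = 1.$

On the other hand obviously $S^{(n)}(\tau, x; t, A) \ge S^{(n-1)}(\tau, x; t, A)$. Hence
$S^{(n)}(\tau, x; t, A) \uparrow P(\tau, x; t, A) \le 1$. That $P(\tau, x; t, A)$ is a solution of (9)
follows immediately from (42), and also the initial condition (2) is obviously
satisfied.

(16) It will be seen (Theorem 4) that the functions defined by (41) and (28) are actually
identical.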

THEOREM 3 (Uniqueness theorem(17)). Consider some fixed $t$ and a function
$P^*(\tau, x; t, A)$ which (i) for fixed $\tau$, $x$, $t$ is a completely additive function of sets
$A \in \mathfrak{B}$ with $0 \le P^*(\tau, x; t, A) \le 1$; (ii) for fixed $x$, $t$, $A$ is an absolutely continuous
function of $\tau$ satisfying for almost all $\tau$ the equation (9) with the initial value (2)
as $\tau \to t - 0$. Then $P^*(\tau, x; t, A) = P(\tau, x; t, A)$, where $P(\tau, x; t, A)$ is the function
defined by Theorem 2.

Proof. (i) We first show that

(44) $P^*(\tau, x; t, A) \ge P(\tau, x; t, A);$

this remains true also if the assumption $P^*(\tau, x; t, A) \le 1$ be replaced by the
weaker one that $P^*(\tau, x; t, A)$ is uniformly bounded. In fact, treating (9) as
an ordinary differential equation, we get by (2)

(45) $P^*(\tau, x; t, A) = \exp\Big\{ -\int_\tau^t p(s, x)\, ds \Big\}\, \delta(x, A) + \int_\tau^t p(\sigma, x) \exp\Big\{ -\int_\tau^\sigma p(s, x)\, ds \Big\}\, d\sigma \int_E P^*(\sigma, y; t, A)\, \Pi(\sigma, x, dE_y).$

Since the last term is non-negative, we see by comparison of (45) with (39)
that $P^*(\tau, x; t, A) \ge Q^{(0)}(\tau, x; t, A) = S^{(0)}(\tau, x; t, A)$. Comparing, then, (45) with
(43) it is readily seen that $P^*(\tau, x; t, A) \ge S^{(n)}(\tau, x; t, A)$ for any $n$, which
proves (44).

(ii) Put

(46) $D(\tau, x; t, A) = P^*(\tau, x; t, A) - P(\tau, x; t, A).$

By (44), $D(\tau, x; t, A)$ is a completely additive function of sets, and
$0 \le D(\tau, x; t, A) \le 1$. Now, assuming that $D(\tau, x; t, E)$ takes on the value
$a > 0$ somewhere, denote by $\tau_0$ the least upper bound of all $\tau$ for which
$D(\tau, x; t, E) \ge a$, so that

(47) $D(\tau, x; t, E) < D(\tau_0, x; t, E) = a$ for $\tau_0 < \tau < t$.

Now $(1/a)\, D(\tau, x; t, A)$ is a solution of (9), which vanishes as $\tau \to t - 0$. Hence

(48) $\dfrac{1}{a} D(\tau_0, x; t, E) = \int_{\tau_0}^t p(\sigma, x) \exp\Big\{ -\int_{\tau_0}^\sigma p(s, x)\, ds \Big\}\, d\sigma \int_E \dfrac{1}{a} D(\sigma, y; t, E)\, \Pi(\sigma, x, dE_y).$

Combining (47) and (48), we get

$1 \le \int_{\tau_0}^t p(\sigma, x) \exp\Big\{ -\int_{\tau_0}^\sigma p(s, x)\, ds \Big\}\, d\sigma = 1 - \exp\Big\{ -\int_{\tau_0}^t p(s, x)\, ds \Big\} < 1.$

Thus the assumption $D(\tau, x; t, E) = a > 0$ leads to a contradiction. Hence, by
(44) and (46), $D(\tau, x; t, A) = 0$ for all sets $A$, and this accomplishes the proof.

(17) Cf. footnote 7.

5. Properties of the solutions. We now prove
5. Properties of the solutions. We now prove 


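THEOREM 4. (i) With the functions defined by Theorems 1 and 2 one has

(49) $P^{(n)}(\tau, x; t, A) = Q^{(n)}(\tau, x; t, A)$

identically; thus equations (28) and (41) define the same function $P(\tau, x; t, A)$.
(ii) This function satisfies the fundamental assumption (7).

Proof. (i) Put $P^{(n+1)}(\tau, x; t, A) = \mathbf{A} P^{(n)}(\tau, x; t, A)$, where $\mathbf{A}$ is a linear
operator acting on the arguments $(t, A)$; and similarly $Q^{(n+1)}(\tau, x; t, A) = \mathbf{B} Q^{(n)}(\tau, x; t, A)$
where the operator $\mathbf{B}$ works on $(\tau, x)$. Using the lemma of §2, we readily see that the
operators $\mathbf{A}$ and $\mathbf{B}$ are permutable.

Now obviously $P^{(0)}(\tau, x; t, A) = Q^{(0)}(\tau, x; t, A)$ and $P^{(1)}(\tau, x; t, A)
= Q^{(1)}(\tau, x; t, A)$. Assuming, then, (49) to be true for $n - 1$ and for some $n \ge 1$, we get

$P^{(n+1)} = \mathbf{A} P^{(n)} = \mathbf{A} Q^{(n)} = \mathbf{A}\mathbf{B} Q^{(n-1)} = \mathbf{B}\mathbf{A} P^{(n-1)} = \mathbf{B} P^{(n)} = \mathbf{B} Q^{(n)} = Q^{(n+1)}.$

(ii) To prove the second part we use the representation (39)-(41). Then

$\dfrac{\delta(x, A) - Q^{(0)}(\tau, x; t, A)}{t - \tau} \to p(t, x)\, \delta(x, A),$

obviously, as $\tau \to t - 0$. Moreover

$\sum_{n=1}^{\infty} Q^{(n)}(\tau, x; t, E) \le 1 - Q^{(0)}(\tau, x; t, E)$

and thus

(50) $\limsup_{\tau \to t} \dfrac{1}{t - \tau} \sum_{n=1}^{\infty} Q^{(n)}(\tau, x; t, E) \le p(t, x).$

From (40) we get however

(51) $\liminf_{\tau \to t} \dfrac{1}{t - \tau} \sum_{n=1}^{\infty} Q^{(n)}(\tau, x; t, A) \ge \liminf_{\tau \to t} \dfrac{1}{t - \tau} Q^{(1)}(\tau, x; t, A) = p(t, x)\, \Pi(t, x, A).$

Applying now (51) both for $A$ and $E - A$, we get by (50)

$\lim_{\tau \to t} \dfrac{1}{t - \tau} \sum_{n=1}^{\infty} Q^{(n)}(\tau, x; t, A) = p(t, x)\, \Pi(t, x, A)$

which proves the theorem.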


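THEOREM 5. For $\tau < \lambda < t$ one has identically

(52) $Q^{(n)}(\tau, x; t, A) = \sum_{k=0}^{n} \int_E Q^{(k)}(\tau, x; \lambda, dE_y)\, Q^{(n-k)}(\lambda, y; t, A)$

where $Q^{(n)}(\tau, x; t, A)$ was defined by (39)-(40).
The solution $P(\tau, x; t, A)$ of Theorems 1 and 2 satisfies the equation (3) of
Chapman-Kolmogoroff(18).

Proof. The second part of the theorem is an immediate consequence of the
first part.

Equation (52) is trivial for $n = 0$. Assuming it to be true for some $n \ge 0$,
we get by (39)-(40)

$\sum_{k=0}^{n+1} \int_E Q^{(k)}(\tau, x; \lambda, dE_y)\, Q^{(n+1-k)}(\lambda, y; t, A)$

$= \exp\Big\{ -\int_\tau^\lambda p(s, x)\, ds \Big\}\, Q^{(n+1)}(\lambda, x; t, A) + \sum_{k=1}^{n+1} \int_\tau^\lambda p(\sigma, x) \exp\Big\{ -\int_\tau^\sigma p(s, x)\, ds \Big\}\, d\sigma \int_E \Pi(\sigma, x, dE_z) \int_E Q^{(k-1)}(\sigma, z; \lambda, dE_y)\, Q^{(n+1-k)}(\lambda, y; t, A)$

$= \exp\Big\{ -\int_\tau^\lambda p(s, x)\, ds \Big\}\, Q^{(n+1)}(\lambda, x; t, A) + \int_\tau^\lambda p(\sigma, x) \exp\Big\{ -\int_\tau^\sigma p(s, x)\, ds \Big\}\, d\sigma \int_E \Pi(\sigma, x, dE_z)\, Q^{(n)}(\sigma, z; t, A)$

$= Q^{(n+1)}(\tau, x; t, A).$

(18) It should be observed that Theorem 5 is valid even in cases where $P(\tau, x; t, A)$ is not a
proper probability distribution, i.e., where $P(\tau, x; t, E) < 1$.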


THEOREM 6. If there is some a>1 and a function w(t) e L* such that uni- 
formly 


p(t, < 


then the solution P(r, x; t, A) of Theorems 1 and 2 is a probability distribution, 
(1) holds. 


Proof. With the notation of the proof of Theorem 1 we have by (35) and 
(31) for any n 


L(r,*, i) f ‘de f y) P(r, x; 0, dEy) f P(r, 0, E)do 


and since 0S P™ (7, x; 0, E) S1 it follows that 
t 
(53) f P(r, 2; 0, E)do = h(r, t)L(r, x, 


where h(r, t) >0 is independent of m. But the left-hand member in (53) is the 
general term of a convergent series, and therefore L(r, x, t)=0. This proves 
the proposition in view of the corollary to Theorem 1. 

6. The temporally homogeneous process. So far it has been shown that 
there is always a function P(r, x; t, A) satisfying all requirements of the 
theory except, perhaps, (1). That (1) does not necessarily hold will be shown 
by means of a simple example in §7. This is a surprising result and requires a 
better understanding of the mechanism of the process. We shall confine our- 
selves to the temporally homogeneous processes; but, at least as far as suffi- 
ciency is concerned, the condition of the following theorem can easily be 
extended to some more general cases. 

We begin with some preliminary remarks and notations. In the case of 


506 WILLY FELLER [November 
$p(t, x)$ and $\Pi(t, x, A)$ not depending on $t$ the solution $P(\tau, x; t, A)$ of Theorems
1 and 2 obviously depends only on $t - \tau$ and we write

$P(\tau, x; t, A) = P(t - \tau, x, A)$.
Similarly we write
(54)    $p(t, x) = p(x), \qquad \Pi(t, x, A) = \Pi(x, A)$.


$\Pi(x, A)$ defines in the usual way an ordinary Markoff chain, that is to say,
a sequence of probability distributions defined by

(55)    $\Pi^{(0)}(x, A) = \delta(x, A), \qquad \Pi^{(n)}(x, A) = \int_E \Pi^{(n-1)}(x, dE_y)\, \Pi(y, A) \quad (n \ge 1)$.

Obviously this chain is closely related to our stochastic process, and in particular
the ergodic properties of the original process will be regulated by the
ergodic properties of the chain (55). Roughly speaking, $\Pi^{(n)}(x, A)$ gives the
conditional probability distribution of the state $X(t)$ under the assumption
that $X(0) = x$ and that a change of state occurred during $(0, t)$ exactly $n$
times, the time of occurrence of these jumps being left out of account.

If $A$ and $\Omega$ are any two sets of $\mathfrak{B}$, we put

(56)    $\Pi_\Omega^{(0)}(x, A) = \delta(x, A\Omega), \qquad \Pi_\Omega^{(n)}(x, A) = \int_\Omega \Pi_\Omega^{(n-1)}(x, dE_y)\, \Pi(y, A) \quad (n \ge 1)$.

In terms of the Markoff chain (55) $\Pi_\Omega^{(n)}(x, A)$ is the probability that the moving
point, starting from $x \in \Omega$, would remain in $\Omega$ for the first $n - 1$ steps and
would be taken into some point of $A$ by the $n$th step. Obviously $\Pi_\Omega^{(n)}(x, A) = 0$
for all sets $A$ if $x$ is not contained in $\Omega$. For fixed $x$ and $\Omega$ the sequence
$\Pi_\Omega^{(n)}(x, \Omega)$ is never increasing: $\Pi_\Omega^{(n)}(x, \Omega) \downarrow a$. If $a > 0$, there is a positive
probability of never leaving the set $\Omega$, if we have started from the point $x \in \Omega$. For
further application we note that for $x \in \Omega$ we have

(57)    $\sum_{k=0}^{n} \Pi_\Omega^{(k)}(x, E - \Omega) + \Pi_\Omega^{(n)}(x, \Omega) = 1$.
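For a chain with finitely many states the kernels of (56) become matrices, and the identity (57) can be checked numerically. The following is a minimal sketch under that assumption; the 4-state transition matrix, the choice of the set Omega and the helper names are hypothetical and serve only as an illustration.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def taboo_kernels(P, in_omega, n_max):
    """Return [K_0, ..., K_n_max], where K_n[x][y] plays the role of Pi_Omega^(n)(x, {y})."""
    n = len(P)
    # D projects onto Omega: D[x][y] = 1 if x == y and x lies in Omega.
    D = [[1.0 if (i == j and in_omega[i]) else 0.0 for j in range(n)]
         for i in range(n)]
    kernels = [D]  # K_0(x, A) = delta(x, A intersected with Omega)
    for _ in range(n_max):
        # Integrate the previous kernel over Omega only, then take one step of Pi.
        kernels.append(mat_mul(mat_mul(kernels[-1], D), P))
    return kernels


def identity_57_lhs(P, in_omega, n):
    """Left-hand side of (57), one value per starting state x."""
    K = taboo_kernels(P, in_omega, n)
    size = len(P)
    out = []
    for x in range(size):
        exits = sum(K[k][x][y] for k in range(n + 1)
                    for y in range(size) if not in_omega[y])
        stays = sum(K[n][x][y] for y in range(size) if in_omega[y])
        out.append(exits + stays)
    return out


# Hypothetical 4-state chain (rows sum to 1); Omega consists of states 0 and 1.
P = [[0.50, 0.30, 0.20, 0.00],
     [0.10, 0.60, 0.00, 0.30],
     [0.00, 0.00, 1.00, 0.00],
     [0.25, 0.25, 0.25, 0.25]]
in_omega = [True, True, False, False]
lhs = identity_57_lhs(P, in_omega, n=5)
```

For starting states inside Omega the computed value equals 1 up to rounding, and for starting states outside Omega it is 0, in agreement with the remark that the kernel vanishes when x is not contained in Omega.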


Finally we introduce the notation

(58)    $\Omega_+ = \Omega\{p(x) > 0\}$,

that is to say, $\Omega_+$ consists of those points $x \in \Omega$ for which $p(x) > 0$.
THEOREM 7. Suppose that $p(t, x)$ and $\Pi(t, x, A)$ are of the form (54).

(i) In order that the solution $P(\tau, x; t, A)$ of Theorems 1 and 2 satisfy (1) it
is necessary and sufficient that whenever for some point $x$ and some set $\Omega \in \mathfrak{B}$
with $\Omega = \Omega_+$ the inequality

(59)    $\Pi_\Omega^{(n)}(x, \Omega) > a > 0$

holds for all $n$, then the series

(60)    $\sum_{n=0}^{\infty} \int_\Omega \frac{1}{p(y)}\, \Pi_\Omega^{(n)}(x, dE_y)$

diverges(19).

(ii) In this statement the series (60) can be replaced by

(61)    $\sum_{n=0}^{\infty} \int_\Omega \frac{1}{p(y)}\, \Pi^{(n)}(x, dE_y)$.

COROLLARY. In order that $P(t, x, E) = 1$ it is necessary that for any point
$x \in E_+$ the series

$\sum_{n=0}^{\infty} \int_{E_+} \frac{1}{p(y)}\, \Pi^{(n)}(x, dE_y)$

diverges. (This condition is, however, not sufficient, as will be shown by an example
in §7.)


Proof. Condition (ii) is stronger than (i). We have, therefore, to prove that 
the divergence of (61) is a sufficient, the divergence of (60) a necessary, condi- 
tion. 

(i) Sufficiency. This part of the proof will rest mainly on the representa- 
tion of P(t, x, A) given by Theorem 2. 

In the case of a temporally homogeneous process the function L(r, x, t) 
defined by (31) depends only on x and t—7, and we write 


L™ x, 1) = — 7, x), 
so that for t>0 


(62) LO, 2) = J dE,). 


Now, using the notation (54), we get from (42) 


(63) POU, 2, + POLo, = A), 


(#%) In the sense that the series is to be considered as divergent if some of the integrals in 
(60) are divergent. 


and for $n \ge 1$
t 
P(t, A) + p(x) f P(g, x, A)do 
(64) 
= f do (2, y, A). 
0 gE 


Combining (64) with (62) we get, for $n \ge 1$,
(65) f P(t, x, dEy) + p(x) L(t, x) = p(x) f Lo-Y(t, y) I(x, dE,). 
E E 


Now, for all points $x \in E - E_+$ (that is to say, if $p(x) = 0$) we have $L^{(n)}(t, x) = 0$
for all $n$. Hence, using an inductive argument, it readily follows from (65)
and (32) that for $x \in E_+$ and any $n \ge 0$


(66) L(t, x) = > az,) f y, dE,). 
Ey 


By P(Y) 

Integrating (66) we get 

n 1 t 

> f L"-»)(t, (x, dE,) = f {1 — L(o, x)}do < t, 

By 0 
and since by (35) $L^{(n)}(t, y) \downarrow L(t, y)$, it follows that for $x \in E_+$

» 1 

(67) —— Lit, (x, dEy) S t. 
By 


Suppose now that there is some point $x_0$ and some $t$ such that $P(t, x_0, E) < 1$.
By the corollary to Theorem 1 this implies that

(68)    $1 > L(t, x_0) = a > 0$.

Denote, then, by $\Omega$ the set of all points $x$ with

(69)    $L(t, x) \ge a$.

Obviously $\Omega = \Omega_+$, since $p(x) = 0$ implies $L(t, x) = 0$. Now by (65) we have

$L(t, x) \le \int_{E_+} L(t, y)\, \Pi(x, dE_y)$

and consequently

$a = L(t, x_0) \le \int_{E_+} L(t, y)\, \Pi(x_0, dE_y) \le a\, \Pi(x_0, E - \Omega) + \int_\Omega L(t, y)\, \Pi(x_0, dE_y)$

$\quad = a\, \Pi_\Omega^{(1)}(x_0, E - \Omega) + \int_\Omega L(t, y)\, \Pi_\Omega^{(1)}(x_0, dE_y)$,

so that by an inductive argument, using (56), we have

(70)    $a \le a \sum_{k=0}^{n} \Pi_\Omega^{(k)}(x_0, E - \Omega) + \int_\Omega L(t, y)\, \Pi_\Omega^{(n)}(x_0, dE_y)$.

Here the sign of equality can hold only if $\Pi_\Omega^{(k)}(x_0, E - \Omega) = 0$ for $k = 0, \dots, n$.
Since $L(t, y) \le 1$ we get from (70) using (57)

(71)    $a \le a(1 - \eta) + \eta$,

where $\eta$ is defined by

$\Pi_\Omega^{(n)}(x_0, \Omega) \downarrow \eta$.

Again, in (71) the sign of equality can hold only if $\Pi_\Omega^{(k)}(x_0, E - \Omega) = 0$ for all $k$;
but by (57) we have in this case $\eta = 1$. Otherwise $a < a(1 - \eta) + \eta$ so that
certainly $\eta > 0$. Thus $\Omega = \Omega_+$ is a set with the property stated in the theorem.
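However, by (67), (68) and (69) we get

$a \sum_{n=0}^{\infty} \int_\Omega \frac{1}{p(y)}\, \Pi^{(n)}(x_0, dE_y) \le \sum_{n=0}^{\infty} \int_\Omega \frac{L(t, y)}{p(y)}\, \Pi^{(n)}(x_0, dE_y) \le \sum_{n=0}^{\infty} \int_{E_+} \frac{L(t, y)}{p(y)}\, \Pi^{(n)}(x_0, dE_y) \le t$,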
which means that the series (61) is convergent. Thus the divergence of the
series implies $a = 0$ or $P(t, x_0, E) = 1$.
(ii) Necessity. This part of the proof will mainly use the representation of 
P(t, x, A) given in Theorem 1. 
Suppose that condition (i) of the theorem does not hold, that is to say,
that there is a set $\Omega = \Omega_+$ for which (59) holds for some fixed $x_0 \in \Omega$ and for
which

(72)    $\sum_{n=0}^{\infty} \int_\Omega \frac{1}{p(y)}\, \Pi_\Omega^{(n)}(x_0, dE_y) = a < \infty$.

(72) implies in particular that all the integrals occurring converge. For this
fixed set $\Omega$ and all points $x \in \Omega$ we define, in analogy to (27), an additive
function $P_\Omega^{(n)}(t, x, A)$ of sets $A \in \mathfrak{B}$ by the recurrence formula


Po (t, x, A) = P(t, x, A), 


(73) x, A) = p(y) — AQ)Ps x, dEy), 
0 


where 


II*(t, x, A) = exp { —tp(y) } dE,). 


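Obviously

(74)    $0 \le P_\Omega^{(n)}(t, x, A) \le P^{(n)}(t, x, A)$,

and putting

(75)    $W_\Omega^{(n)}(t, x, A) = P^{(n)}(t, x, A) - P_\Omega^{(n)}(t, x, A)$

it is readily seen that both $W_\Omega^{(n)}(t, x, A)$ and $P_\Omega^{(n)}(t, x, A)$ are non-negative
completely additive functions of sets $A \in \mathfrak{B}$. Furthermore

(76)    $P(t, x, A) = \sum_{n=0}^{\infty} P_\Omega^{(n)}(t, x, A) + \sum_{n=0}^{\infty} W_\Omega^{(n)}(t, x, A)$.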


$P_\Omega^{(n)}(t, x, A)$ can be interpreted as the compound probability that the state
$X(t)$, starting from the point $x \in \Omega$ at $t = 0$, will during the time $t$ change by exactly
$n$ jumps and in such a way that it remains contained in $\Omega$ during the whole
time and is contained in $A$ at the moment $t$. Of course $P_\Omega^{(n)}(t, x, E - \Omega) = 0$ for
$x \in \Omega$.

Now (33) reads in our present notation 


A) + f f P(g, x, dEy) 
(77) 0 A 
= f do f (y)II(y, A)P(o, x, dEy). 
0 


It is easily seen that the same calculations lead, for P+" (t, x, A), to the 
analogous formula (supposing, of course, x e 2) 


Pot” (4, 2, A) + f f p(y) Ps (0, x, dE,) 
(78) 0 A 


p(y)T(y, x, dE,). 
0 


Subtracting (78) from (77), we get by (75) 


t 
wert, A d we x, dE, 


J | as f A — AQ)Pa’(o, x, dEy) 
0 


+ J J A)Wa' (0, x, dE,). 


Now for any x e Qand any A 




(0) 1 


and from (78) we get 


and thus by induction 


(80) f Po’ (o, x, dEy)do (x, AQ) 


or 


1 


(the convergence of the right-hand member being guaranteed by (72)). 
It follows from (81) and (72) that 


t 
0 
and since $P_\Omega^{(n)}(t, x_0, E - \Omega) = 0$ we can also write


(82) xo, E)do S a, 
0 


Next, we deduce a limitation for $W_\Omega^{(n)}(t, x_0, E)$. Putting for $n \ge 1$


a, = f do E- (0, Xo, dE,) 


we readily get by (80) 
5 f (se, dE,) = Ts" 9). 
Hence, by (57) and (59), 
La 
By definition $W_\Omega^{(0)}(t, x_0, E) = 0$. Hence we obtain from (79)


> Ws (t, x0, E) = > — f Ws (0, xo, 
n=O n=l 0 2 




or 


(83) > Wa'(t, E) S$ 1—a. 
n=O 


Combining (83) with (82) we get finally, using (76), 
xo, E)\do S a+ (1 — adé 
or, for ¢ sufficiently large, 
P(e, xo, E)do < t. 


It follows that $P(t, x_0, E) \not\equiv 1$, which proves the necessity of our condition.

A few words may be added about the meaning of the conditions of Theorem 7.
Suppose that there is a set $\Omega = \Omega_+$ such that $\Pi_\Omega^{(n)}(x_0, \Omega) > \eta > 0$ for some
$x_0 \in \Omega$, and such that (61) converges. Consider, then, a random point moving
at given moments in $E$ by jumps according to the probability laws expressed
by the ordinary Markoff chain (55). It is obvious that, if the point remained
in $\Omega$ during the first $n$ steps, the probability of never leaving $\Omega$ tends to 1.
Thus, for any $\epsilon > 0$, there are points $x \in \Omega$ for which $\Pi_\Omega^{(n)}(x, \Omega) > 1 - \epsilon$. The
proof given for the sufficiency of our condition shows that also for all these
points the series (61) will converge. Thus, in the statement of Theorem 7,
$a$ can be replaced by $1 - \epsilon$.

Denote now, as before, by $A_a$ the set of points $x$ with $p(x) \le a$. The
convergence of (61) implies the convergence of

$\sum_{n=0}^{\infty} \Pi^{(n)}(x, A_a \Omega)$

for any fixed $a$, and accordingly there is some sequence $a_n \to \infty$ such that even

$\sum_{n=0}^{\infty} \Pi^{(n)}(x, A_{a_n} \Omega)$

converges. This means, however, that there is a probability $\eta > 0$ for our moving
point to be contained for all $n$ after $n$ steps in $\Omega - \Omega A_{a_n}$, that is to say, in
a part of $\Omega$ where $p(x) > a_n \to \infty$. In other words, there is a positive probability
that our moving point will move, in the mean, towards points with increasing
$p(x)$; and if it did so for the first $n$ steps, the probability that it will
continue tends to unity as $n \to \infty$. Thus, in terms of the ergodic properties of
the Markoff chain (55), the convergence of (61) is only possible if the point $x$
is contained in a dissipative part of $E$.

Now the same reasoning applies also to the change of the state $X(t)$ under
the influence of our stochastic process (cf. the interpretation of (55) on page
506). Roughly speaking, if $P(t, x, E) < 1$, the difference $1 - P(t, x, E)$ can be
interpreted as the probability that the state $X$ will, starting from the point $x$,
change during the time $t$ by infinitely many jumps. The $n$th jump takes $X$
in the set $\Omega_n \subset \Omega$ and $\Omega_n \to 0$. It follows from Theorem 7 in particular that we
have $P(t, x, E) = 1$ for any point $x$ belonging to an ergodic part of $E$, that
is to say, if there is some bounded set $A$ such that

$\lim \frac{1}{N} \sum_{n=1}^{N} \Pi^{(n)}(x, A) > 0$.


7. Examples. 
(i) Consider the case of an enumerable $E$, with the points $x_0, x_1, \dots$,
and of a temporally homogeneous process. Let the $p_i$ be any given positive
constants and

$\Pi_{ik} = \begin{cases} 1 & \text{for } k = i + 1, \\ 0 & \text{for } k \ne i + 1. \end{cases}$

That is to say, from $x_i$ only a transition to $x_{i+1}$ is possible, and the probability
of such a transition during an interval of length $\Delta t$ is $p_i \Delta t + o(\Delta t)$.
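The differential equations (13) of the process take on the form

(84)    $P'_{ik}(t) = -p_k P_{ik}(t) + p_{k-1} P_{i,k-1}(t)$

so that

(85)    $P_{ik}(t) = 0$ for $k < i$, $\qquad P_{ii}(t) = e^{-p_i t}$

and

$P_{ik}(t) = p_{k-1} \int_0^t \exp\{-p_k (t - \sigma)\}\, P_{i,k-1}(\sigma)\, d\sigma$ for $k > i$.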


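In the case that $p_i \ne p_k$ for $i \ne k$, the explicit solution is for $k > i$

$P_{ik}(t) = p_i p_{i+1} \cdots p_{k-1} \sum_{n=i}^{k} \frac{e^{-p_n t}}{\prod_{j=i,\, j \ne n}^{k} (p_j - p_n)}$;

it can be verified by means of the Lagrange interpolation formula but is of
little use. The solution in the case in which $p_i \ne p_k$ for $i \ne k$ is not necessarily
true follows by the usual passage to the limit. We have

$\Pi^{(n)}_{ik} = \begin{cases} 1 & \text{for } k = i + n, \\ 0 & \text{otherwise,} \end{cases}$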


and thus, by Theorem 7, the necessary and sufficient condition for $\sum_k P_{ik}(t) = 1$
is that $\sum_{n=0}^{\infty} 1/p_{i+n}$ diverges(20). This can also be easily verified directly. Putting

(86)    $L_{ik}(t) = p_k \int_0^t P_{ik}(\sigma)\, d\sigma$

it follows from (84) for $k > i$ that

(87)    $P_{ik}(t) + L_{ik}(t) = L_{i,k-1}(t)$

while $P_{ii}(t) + L_{ii}(t) = 1$ and thus $L_{i,i+n}(t) \downarrow L_i(t)$ as $n \to \infty$. On the other hand
we have by (87)

$\sum_{k=i}^{i+n} P_{ik}(t) = 1 - L_{i,i+n}(t)$

or

(88)    $\sum_{k=i}^{\infty} P_{ik}(t) = 1 - L_i(t)$.

But by (86) and (88), $L_i(t) > 0$ would imply

(89)    $\sum_{k=i}^{\infty} \int_0^t P_{ik}(\sigma)\, d\sigma = \sum_{k=i}^{\infty} \frac{L_{ik}(t)}{p_k}$,

that is to say, the convergence of $\sum_k 1/p_{i+k}$. Conversely, by (89) the divergence
of $\sum_k 1/p_{i+k}$ implies that $L_i(t) = 0$, or by (88) that $\sum_k P_{ik}(t) = 1$. A similar
argument can be applied even in the case that the $p_i$ depend on $t$.
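The criterion can be illustrated numerically. The sketch below is not taken from the paper: it assumes the system (84) in the form $P'_{k} = -p_k P_k + p_{k-1} P_{k-1}$ (start in $x_0$), truncates it to finitely many states, and integrates it by backward Euler steps; the function name, the truncation level and the step size are illustrative choices. Two hypothetical rate sequences are compared, $p_k = k + 1$ (the series of reciprocals diverges) and $p_k = (k + 1)^2$ (it converges).

```python
def total_mass(rates, t, dt=0.002):
    """Approximate the sum of P_{0k}(t) over the states k < len(rates)."""
    n_states = len(rates)
    prob = [0.0] * n_states
    prob[0] = 1.0  # the process starts in x_0
    steps = int(round(t / dt))
    for _ in range(steps):
        new = [0.0] * n_states
        inflow = 0.0  # p_{k-1} * P_{k-1} at the new time level
        for k in range(n_states):
            # Backward Euler step for P_k' = -p_k P_k + p_{k-1} P_{k-1}.
            new[k] = (prob[k] + dt * inflow) / (1.0 + dt * rates[k])
            inflow = rates[k] * new[k]
        prob = new
    return sum(prob)


n_states = 100
linear = [float(k + 1) for k in range(n_states)]            # sum of 1/p_k diverges
quadratic = [float((k + 1) ** 2) for k in range(n_states)]  # sum of 1/p_k converges

mass_linear = total_mass(linear, t=2.0)
mass_quadratic = total_mass(quadratic, t=2.0)
```

With linear rates the retained mass stays very close to 1, while with quadratic rates a substantial part of the mass has already left every finite set of states by $t = 2$, which is the behaviour the condition on $\sum 1/p_{i+n}$ predicts. The truncation can only lose mass, so the second figure is a lower estimate of the true total.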

The stochastic process just described plays an important role for different
applications. In the case that all $p_i$ are equal, $p_i = p$, it reduces to the classical
Poisson process

$P_{ik}(t) = e^{-pt}\, \frac{(pt)^{k-i}}{(k - i)!} \qquad (k \ge i)$.
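As a plausibility check, the Poisson probabilities can be recovered from the integral recursion for the birth process when all rates are equal. The sketch below assumes that recursion in the form $P_{i,k}(t) = p \int_0^t e^{-p(t-\sigma)} P_{i,k-1}(\sigma)\, d\sigma$ with $P_{ii}(t) = e^{-pt}$, and evaluates it with a trapezoidal rule; the function name, grid size and the values $p = 2$, $t = 1$ are illustrative assumptions.

```python
import math


def birth_probabilities(p, steps_up, t, n_grid=200):
    """Return [P_{i,i}(t), ..., P_{i,i+steps_up}(t)] for equal rates p."""
    h = t / n_grid
    grid = [j * h for j in range(n_grid + 1)]
    current = [math.exp(-p * s) for s in grid]  # P_{ii} on the grid
    values = [current[-1]]
    for _ in range(steps_up):
        nxt = [0.0]  # the integral over an empty interval vanishes
        for j in range(1, n_grid + 1):
            # Trapezoidal rule for p * int_0^{t_j} exp(-p (t_j - s)) P_prev(s) ds.
            f = [math.exp(-p * (grid[j] - grid[m])) * current[m]
                 for m in range(j + 1)]
            nxt.append(p * h * (sum(f) - 0.5 * (f[0] + f[-1])))
        current = nxt
        values.append(current[-1])
    return values


p, t = 2.0, 1.0
values = birth_probabilities(p, steps_up=3, t=t)
poisson = [math.exp(-p * t) * (p * t) ** m / math.factorial(m) for m in range(4)]
```

The two lists agree to within the quadrature error, which is small here because the integrands are low-degree polynomials in $\sigma$ times a constant.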


The general case was used by Lundberg [7] in the theory of invalidity insurance,
and by Feller [4] to describe the growth of some biological populations.
In both cases it is natural to assume that $p_i \to \infty$ as $i \to \infty$. The same
stochastic process was also applied to describe radioactive processes, $x_i$ standing
for the “elementary probability” of its disintegration; but here, of course,
the space $E$ contains only a finite number of points (or, what amounts to the
same, some $p_n = 0$).

(ii) Finally we give an example which proves that the condition of the 
corollary to Theorem 7 is not sufficient. 

Let $E$ consist of the points $x_i$; $i = 0, \pm 1, \pm 2, \pm 3, \dots$. The process is


(20) The vanishing of any particular $p_n$ obviously implies that $P_{ik}(t) = 0$ for any couple $(i, k)$
with $i \le n < k$; and it is readily seen that $\sum_k P_{ik}(t) = 1$ for any


again temporally homogeneous. For $i < 0$ only the transition $x_i \to x_{i-1}$ is possible;
for $i \ge 0$ both $x_i \to x_{i+1}$ and $x_i \to x_{-i-1}$ are possible, the corresponding
probabilities being $1 - \eta_i > 0$ and $\eta_i > 0$. In other words we suppose that

$\Pi_{ik} = \begin{cases} 1 & \text{if } i < 0,\ k = i - 1, \\ 1 - \eta_i & \text{if } i \ge 0,\ k = i + 1, \\ \eta_i & \text{if } i \ge 0,\ k = -i - 1, \\ 0 & \text{otherwise.} \end{cases}$

Let us now suppose that (i) the product $\prod (1 - \eta_n) = a > 0$, (ii) $p_i = 1$ for $i < 0$,
and (iii) $\sum_{i \ge 0} 1/p_i = \alpha$ converges. Then the condition of the corollary to Theorem 7
is satisfied. For obviously we have if $n > 0$, $i \ge 0$,


and if i<0 


= 1. 


Hence 


1 n (n 


diverges. But, taking for $\Omega$ the set of all points $x_i$ with $i \ge 0$, it is readily seen
that (59) holds for any $x = x_i$, $i \ge 0$, and nevertheless the series converges.


REFERENCES 


1. J. L. Doob, Stochastic processes depending on a continuous parameter, these Transactions,
vol. 42 (1937), p. 107.

2. W. Dubrovski, Eine Verallgemeinerung der Theorie der rein unstetigen stochastischen
Prozesse von W. Feller, Comptes Rendus (Doklady) de l'Académie des Sciences de l'URSS,
vol. 19 (1938), p. 439.

3. W. Feller, Zur Theorie der stochastischen Prozesse (Existenz- und Eindeutigkeitssätze),
Mathematische Annalen, vol. 113 (1936), p. 113.

4. W. Feller, Die Grundlagen der Volterraschen Theorie des Kampfes ums Dasein in
wahrscheinlichkeitstheoretischer Behandlung, Acta Biotheoretica, vol. 5 (1939), p. 11.

5. M. Fréchet, Recherches Théoriques Modernes sur le Calcul des Probabilités, Part II
(Traité du Calcul des Probabilités, vol. 1, no. 3), 1938.

6. A. Kolmogoroff, Ueber die analytischen Methoden in der Wahrscheinlichkeitsrechnung,
Mathematische Annalen, vol. 104 (1931), p. 415.

7. O. Lundberg, forthcoming dissertation, Stockholm.

8. G. Pólya, Sur la promenade au hasard dans un réseau des rues, Lecture at the “Colloque
Consacré à la Théorie des Probabilités,” Geneva, 1937, Actualités Scientifiques et Industrielles,
no. 734, 1938, p. 25.

9. Added in proof; cf. the footnote on page 492: W. Doeblin, Sur certains mouvements
aléatoires discontinus, Skandinavisk Aktuarietidskrift, 1939, p. 211.


BROWN UNIVERSITY,
PROVIDENCE, R. I.


ON LINEAR TRANSFORMATIONS 


BY 
R. S. PHILLIPS 


The purpose of this paper is to give a characterization of linear and completely
continuous transformations both on the common Banach spaces to an
arbitrary Banach space and vice versa. There is an abundant literature on
this subject. Among the earliest papers, the now famous paper of Radon [24]
should be mentioned. Here linear transformations on $L^p$ to $L^q$ ($1 < p, q < \infty$)
are characterized in a manner suggestive of the methods used in the present
paper. The works of Gelfand [12], Dunford [6], Kantorovitch and Vulich
[17], and Dunford and Pettis [9] contain much material on this subject
supplementary to that treated here. In the interest of completeness we have
restated a few of the results obtained by Gelfand [12], and Gowurin [13].

The principal tools used in our characterizations are certain abstractly
valued function spaces. One such space is the class of all additive set
functions $x(\tau)$ on all Lebesgue measurable subsets $\tau$ of (0, 1) to a Banach
space $X$ where for all linear functionals $\bar{x}$ on $X$ and for all subdivisions
$\pi = (\tau_1, \tau_2, \dots, \tau_n, \dots)$ of (0, 1) into disjoint sets,

$\text{L.U.B.}_{\pi} \sum_i \frac{|\bar{x}[x(\tau_i)]|^q}{|\tau_i|^{q-1}} < \infty$.

If $\phi(t) \in L^p$ ($1/p + 1/q = 1$), we define an integral $\int \phi\, dx$ to be the generalized
$\pi$-limit of the unconditionally convergent sums $\sum_i \phi(t_i) x(\tau_i)$ where $t_i \in \tau_i$. The
function $U(\phi) = \int \phi\, dx$ so defined on $L^p$ is a characterization of the general
linear transformation on $L^p$ to $X$.

The first section is a study of the abstractly valued function spaces which
will be used to characterize the transformations. Section 2 is devoted to a
discussion of three different types of integrals needed in these characterizations.
In §3 a necessary and sufficient condition for a subset of a Banach space
$Y$ to be conditionally compact is given in terms of an arbitrary determining
manifold $\Gamma$ in the conjugate space $\bar{Y}$. As a consequence, if a transformation $U$
is additive and homogeneous on $X$ to $Y$ and its adjoint is completely continuous
on $\Gamma$ to $\bar{X}$, then $U$ is completely continuous on $X$ to $Y$. The section also
contains a characterization of conditionally compact sets in a Banach space
by means of a generalized base. This is applied to the spaces $L^p$ ($1 \le p \le \infty$)
in §§5 and 6. Section 4 contains the principal results of this paper, namely,
a characterization for the classes of transformations considered. In §5 we


Presented to the Society, December 29, 1939; received by the editors March 9, 1940. 
This paper was received by the editors of the Annals of Mathematics November 18, 1939, 
accepted by them, and later transferred to these Transactions. 




LINEAR TRANSFORMATIONS 517 


obtain representations by means of a kernel of the general completely continuous
transformation and weakly completely continuous separable transformation
on $L$ to an arbitrary Banach space. By means of this result and a
theorem due to Dunford and Pettis [9], we show that $U^2$ is completely
continuous whenever $U$ on $L$ to $L$ is weakly completely continuous. As a
further application of this work, we prove in §6 that completely continuous
transformations on the spaces $L^p$, $l^p$, $C$, $c_0$, $M_T$ ($1 \le p \le \infty$) to an arbitrary
Banach space are approximable in the norm by degenerate transformations.
A final section is devoted to the extension of linear transformations.

We will consider an abstract class $T$ of elements $t$ possessing a sigma
family $G$ of subsets $\tau$ of $T$. $\alpha(\tau)$ will be a single-valued, non-negative, completely
additive measure function on $G$, which need not be finite valued. It
will be convenient to designate by $|\tau|$ the value $\alpha(\tau)$. $X$ will denote a Banach
space of elements $x$ and $\bar{X}$ its conjugate space of elements $\bar{x}$ [1, chap. 5]. We
define with Dunford [7, p. 316] a determining manifold $\Gamma$ in $\bar{X}$ to be a closed
linear subset of $\bar{X}$ such that(1) L.U.B. $[\,|\bar{x}(x)| \mid \bar{x} \in \Gamma, \|\bar{x}\| \le 1] = \|x\|$ for every $x \in X$.
$M_T$ will be the Banach space of bounded functions $a(t)$ on an abstract class
$T = [t]$ to real numbers having the norm $\|a\| = $ L.U.B. $[\,|a(t)| \mid t \in T]$. $\pi$ will
have three different meanings: type 1, $\pi$ will be a finite or denumerable set
of disjoint sets $\tau$ of $G$ such that $0 < |\tau| < \infty$. $\pi_1 \ge \pi_2$ will mean that $\sum \tau^1 \supset \sum \tau^2$
and that every set $\tau^1 \in \pi_1$ is either a subset of some $\tau^2 \in \pi_2$ or $\tau^1$ is disjoint from
every $\tau^2 \in \pi_2$; type 2, $\pi$ will be a subdivision of $T$ into a finite number of disjoint
sets $\tau \in G$ ($G$ need not possess a measure function). $\pi_1 \ge \pi_2$ will mean that
each $\tau^1 \in \pi_1$ is a subset of some $\tau^2 \in \pi_2$; and type 3, $\pi$ will be a subdivision of the
interval (0, 1) into a finite number of intervals the maximum of whose lengths
is $|\pi|$. $\pi_1 \ge \pi_2$ will mean that $|\pi_1| \le |\pi_2|$. In each case the relation $\ge$ on the
class $[\pi]$ is transitive and compositive. The general limit of E. H. Moore and
H. L. Smith [20, p. 103] can therefore be defined on each of these ranges.
$\lim_\pi$ will designate this limit.

1. Abstractly valued function spaces. We will be interested in the follow- 
ing classes of functions: 


V\(X,T) = | L.U.B. >> < ©, r| 
| #[x(r.)]|¢ 


= [x(r) | M-|r], |r| < 


< ser}, 


Eo | L.U.B. >> 


v(X,T) = | #(xn)|* < #e 


v°(X) = | L.U.B. < | 


(1) The class of elements $s$ satisfying the property $P$ will be designated by $[s \mid P]$.


518 R. S. PHILLIPS [November 


For g#1 (q=1) m is always to be understood as being of type 1 (type 2). 
For g=1, G may be a finitely additive Jordan field. It will be convenient to 
denote an element of one of these function classes by %. 

If $T$ is the set of positive integers, $G$ the family of all subsets of $T$, and
if $|\tau|$ is equal to the number of integers in $\tau$, then $V^q(X, \Gamma)$ ($1 < q < \infty$) and
$V^\infty(X)$ are identical with $v^q(X, \Gamma)$ and $v^\infty(X)$ respectively. Theorems analogous
to Theorems 1.1, 1.2, and 1.5 have been proved for $v^1(X, \Gamma)$ by Gelfand
[12].


1.1. THEOREM. Jfze V(X,T),15q< ©, then there exists an M such that(?) 


L.U.B. [ | | < 


|r 


foralléeT. 
Define 


> | #[x(rs)] 


| 


= LU.B. [ 2 


on I. It is easy to show that 20, p(4:+ 42) (41) + p( 42), and that 
implies lim inf,.. p(4.) 2 p(#). The theorem now follows from a lemma due to 
Gelfand [12, p. 240]. 

We define a norm for the several spaces as follows: 


||<|| = L.U.B. > | £e rt, 


eV(X,T), 


|| = L.U.B. red, | r| < xe V(X), 
T 


= L.U.B. [ ale.) |, r|, £eo'(X, 


1.2. THEOREM. V*(X,T), 1 ©, is a Banach space. 


It is clear that the spaces are linear normed spaces. Only the proof of com- 
pleteness remains. Suppose {%,} isa Cauchy sequence in V*(X, T), 1S5q<. 
Then for every 7 e ©, 

— Xm 


m, no | r| ql 


uniformly in the unit sphere of Hence lim, —x,(r)|| = 0 and there 
exists an additive set function x(r) =lim,... X(7). Further, if we are given # in 


(2) For $q = 1$, we define $|\tau|^{q-1}$ to be identically one.




the unit sphere of I’, 7, and N, then 


| |“ 


| 


lim | 


so that te V*(X,T). Finally for an arbitrary e>0 there exists N, such that if 
m, n2N, then ||%n—2n|| Se. Therefore if n>N. 


| Ti |e-1 


Completeness for V“(X) can be demonstrated in a similar fashion. 

If X is the space of real numbers R, then I’ must be identical with X¥ = R. 
For < it is well known that V*(R) (1 ©) is equivalent 
[1, p. 180] to the space L*(a) of measurable functions ¥(t) for which 
JSr|W(t)| (15q< and ess. L.U.B. [|y(4)||te T]< (g= The 
is defined by the transformation U(W) =x(r)=J/»(t)da for 

<0, 

A sum $\sum x_n$ will be said to be unconditionally convergent if $\sum x_n$ summed
over any subsequence of the integers converges. $x(\tau)$ will be called completely
additive if for any sequence of disjoint measurable sets $\{\tau_n\}$, $x(\sum \tau_n) = \sum x(\tau_n)$
where the sum is unconditionally convergent.


1.3. THEOREM. If $x \in V^q(X, \Gamma)$ ($1 < q \le \infty$) and if $\tau_0$ is of finite measure, then
$x(\tau)$ is an absolutely continuous and completely additive set function on measurable
subsets of $\tau_0$.


If T), then 


| 


for all eT’ and all r (0<|7| < ©). Hence ||x(r)|| S||2|| - which implies 
absolute continuity. Let us now consider x(r) and V*(R) on subsets of a set To 
of finite measure. Since #[x(r) ] e V*(R) it follows from the above that there 
exists y(t) e L*(a) for which #[x(r) ]=/,(t)da. Given a sequence {ra} of 
disjoint measurable sets, then 


By a theorem due to Dunford [7, p. 326, Theorem 32], x(S>7.) =).x(T2) 
which is unconditionally convergent. 

It is clear that the transformation U(z)=2[x(r)] on T to V*(R) 
(19 @) is linear and that || U|| =||x||. We define V*.(X, I’) to be the sub- 
space of V*(X, I) for which this transformation is completely continuous. 
Because the class of completely continuous transformations on X to Y is a 




closed linear subspace of the space of linear transformations, it follows that 
the same is true of V%.(X, I’) in V*(X, TI’). We define T) @) 
in a similar fashion. 


1.4. THEOREM. A necessary and sufficient condition that $x$ belong to $V^\infty_c(X)$
is that $x$ belong to $V^\infty(X)$ and that the set $[x(\tau)/|\tau| \mid \tau \in G]$ be conditionally
compact.


This is an immediate consequence of Theorem 3.1. 


1.5. THEOREM. A necessary and sufficient condition that $x$ belong to
$V^1_c(X, \Gamma)$ is that $x$ belong to $V^1(X, \Gamma)$ and that the set $[x(\tau) \mid \tau \in G]$ be conditionally
compact.


This again follows from Theorem 3.1. 


1.6. THEOREM. If T=(0, 1) and a is the Lebesgue measure function, then 
for 1<q<~©@ the following are equivalent statements: 

(1) V*.(X, T). 

(2) e V(X, T) and 


1 
dt 
t+h 


h-0 0 dt 


uniformly for all in the unit sphere of T (0, t)). 


This follows from the above remarks on the equivalence between V*(R) 
and L¢ and a theorem due to M. Riesz on compact sets in L¢ [25]. 


1.7. THEOREM. A necessary and sufficient condition that x belong to v4.(X, T) 
for 1Sq<@ is that belong to T) and that limn.«. | (xs) | *=0 uni- 
formly in the unit sphere of T. 


This follows from a well known theorem on compact sets in /*, which for 
q=2 is due to Fréchet [11, p. 19]. 

If «ev(X, then where » runs 
through all finite sets of integers. If « e v'.(X, I), we have as a corollary 
to Theorem 1.7 that }>x, is unconditionally convergent. Dunford has shown 
that if * is unconditionally convergent, then # e v'.(X, I’) [7, p. 326, Theo- 
rem 32]. 


1.8. THEoREM. If X is separable and if & e v'(X, T), then # e v'.(X, X). 


By hypothesis >>| #.(x) | 7 +| -||x|| for every x e X. Hence for every de- 
numerable set of integers .%n(x)| - ‘fell and is therefore a linear 
functional #, on X. Since X is a determining manifold in X, it follows by the 
above mentioned Dunford theorem that }>x, is unconditionally convergent. 
In other words # € v'.(X, X). 




1.9. COROLLARY. If $X$ is separable and there exists an $x \in v^1(X, \bar{X})$ which
is not an element of $v^1_c(X, \bar{X})$, then $X$ and any separable space containing $X$
as a subspace are not conjugate spaces.


This permits another demonstration of the fact that $c$ and hence $C$ which
contains $c$ is not a conjugate space. Let $x_n$ be the $n$th unit vector in $c$. Any
$\bar{x} = \{a_n\} \in l$ [1, p. 67]. $\sum |\bar{x}(x_n)| = \sum |a_n| < \infty$ implies that $x \in v^1(c, l)$. However
$\sum x_n$ is obviously not unconditionally convergent.
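The two facts used in this remark can be illustrated on finite truncations. The sketch below is only an illustration under stated assumptions: sequences are cut off after `dim` coordinates, and the coefficients $a_n = 1/n^2$ are a hypothetical choice of an absolutely summable sequence; the helper names are not from the paper.

```python
def unit_vector(n, dim):
    """n-th unit vector (1-based), truncated to the first `dim` coordinates."""
    return [1.0 if j == n - 1 else 0.0 for j in range(dim)]


def sup_norm(v):
    return max(abs(c) for c in v)


def partial_sum(N, dim):
    """x_1 + ... + x_N as a truncated sequence."""
    total = [0.0] * dim
    for n in range(1, N + 1):
        total = [a + b for a, b in zip(total, unit_vector(n, dim))]
    return total


dim = 50
# Consecutive partial sums stay at sup-norm distance 1, so they are not Cauchy.
gaps = [sup_norm([a - b for a, b in zip(partial_sum(N + 1, dim), partial_sum(N, dim))])
        for N in range(1, dim)]

# A functional given by a_n = 1/n^2 takes the value a_n on x_n;
# the scalar series of absolute values has bounded partial sums.
a = [1.0 / n ** 2 for n in range(1, dim + 1)]
scalar_partial = sum(abs(a_n) for a_n in a)
```

Every entry of `gaps` equals 1, while `scalar_partial` stays below $\pi^2/6$; the scalar series converge although the vector series does not.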

2. Integrals. It is convenient to divide the discussion of this section into 
three parts: (1) an integral involving functions ¢(¢) e L?(a) and # e V*(X, T) 
where 1/p+1/q=1 and 1<qS~; (2) an integral involving functions $(t) 
either bounded or a-measurable and essentially bounded and ¢ e V4(X,T); 
and (3) an integral involving functions ¢(¢) e C and functions x(¢) to be de- 
fined. \ 


2.1. DEFINITION. For functions $(t) e L?(a) andze V(X,T),1<qs we 
define 


f = lim (x of type 1) 


whenever for some mo and all r=10, > x(ts)x(11) is unconditionally convergent 
for each t; e +; and the limit exists. 


The multiple valued function x(7) on a transitive and compositive class 
[x] will be said to be a fundamental x-sequence if for an arbitrary e>0 there 
exists a 7, such that for m1, 7227, ||x(a1) Se(*). 


2.2. Lemma. If [x(xr)] is a fundamental r-sequence and U a linear trans- 
formation on X to Y, then there exists an x e X such that x=lim, x(r) and 
U(x) =lim, U[x(x) ](4. 


Choose a sequence of positive numbers e,—0. It is clear that one can ob- 
tain a sequence 7, such that 7,4,:27, and for r27, ||c(ar) —x(r,)|| Se,. Let 
x’(a,) be one of the elements of x(z,,). As X is sequentially complete there will 
exist an x e X such that x=lim, x’(z,). But then if r27,, \|x —2x(ar)|| S2¢e, 
and likewise U(x) — U[x(x) ]|| U||-2e,. Hence x=lim, x(r) and U(x) 
=lim, U[x(x) |. 


2.3. THEOREM. If $(t) e L®(a) and te V*(X,T), then exists. 


Given e>0, there exists 7, such that if then < 


and Se for all 4; e7;e and all fe rhe where 
7, ¢7}. Then 


(3) If $B$ is a subset of $X$, $\|B\| = $ L.U.B. $[\,\|x\| \mid x \in B]$.
(4) Compare with Moore and Smith [20, p. 106].




and therefore approaches zero uniformly in the unit sphere of T as n# o. 
It follows that >>.@(t;)x(7;) is unconditionally convergent. Further if 27, 


DL — H(t, = LU.B. (6(4) — 


er, 


Hence }>,¢(t;)x(r;) is a fundamental w-sequence and by Lemma 2.2 
> = Jodx exists. 
2.4. THEOREM. L.U.B. [|| | e L7(a), ||¢|| =1] 
Since V*(R) will be used to indicate the function #[x(r) ] e V*(R)) 
L.U.B. | f odx| = L.U.B. | f ¢d%(x) | = ||2(%)||. 


lol =1 


Therefore 


LLU. | f | sae eT, = = 1] 


= LU.B. [||2(4)|| | = 1] = 


One can likewise define this integral on any measurable set. We designate 
the so-defined integral on r by /,¢dx. f,¢dx is clearly an additive set function 
on G. Since by Theorem 2.4 || - | it follows that it is 
absolutely continuous and consequently completely additive. 


2.5. THEOREM. If $(t) e L*(a), a(T)<~, and e V*(X, T) is such that 
x(r) y(t)da, then fodx=f'pyda where are both either Dunford integrals 
with y(t) e [7] or Birkhoff integrals [3]. 


Suppose that y(t) e 4(E)[X, 1]; then #(x(r)) = f-#(y)da for every 4 
It follows from the similar theorem in real variables that 


This is equivalent to the statement that $(t)-y(t) e.C)(E)[X, T] and 
Sodx = {'pyda. 

On the other hand, suppose y(t) is Birkhoff integrable to the value x(r). 
Then let ¢,(¢) =(t) if | Sn and vanish elsewhere. ¢,(t)y(t) differs from 




¢(y)t(y) on a set whose measure approaches zero as n— ©. Since ¢,(t) is 
bounded, ¢,(#) -y(¢) is Birkhoff integrable [3, p. 369, Theorem 17]. Moreover 
it is integrable to the same value as the above Dunford integral so that 
Sbnyda= f,p,dx. By Theorem 2.4, 


| f |- J sllal-[ fl 


The integrals [,¢nyda are therefore uniformly absolutely continuous. By a 
theorem due to the author [23, Theorem 6.2] $(¢)-y(¢) is Birkhoff integrable 


and 
f gyda = lim ff = ff 


We will next consider G-measurable functions $(¢) either bounded or es- 
sentially bounded relative to a measure a. x(r) e V'(X, I) is defined on all 
7 e G and in the latter case vanishes on the null sets of a. For convenience 
we will limit ourselves to the former case. 


2.6. DEFINITION. For a bounded function (t) and x e V1(X,T), we define 


f = lim (x of type 2) 


whenever for t; an arbitrary element of 7; this limit exists. 


When /¢dx exists by both Definition 2.1 and Definition 2.6, the value in 
each case is the same. 

The following two theorems are special cases of a theorem due to Gowurin 
[13, pp. 265-266]. We omit their proofs. 


2.7. THEOREM. If $(t) is bounded and V4(X,T), then exists. 
2.8. THEOREM. L.U.B. [|| | o(¢)| $1] 


It is unlikely that much can be said about the differentiation of 
&e V(X, X) for 1Sq< ©. Pettis has constructed an e V*,(L*, L*) [22, 
Example 9.4] which has no pseudo-derivative [22, p. 300]. In §5 we demon- 
strate that e V".(X) (X arbitrary) and e V*(X) (X separable and regu- 
lar) for T=Do7; (|71| < ©) can be expressed as the Birkhoff integral of a 
function on T to X. 

We wish finally to consider an integral for functions ¢(t) e C. In this 
connection Gelfand [12, pp. 246-253] has introduced the abstractly valued 
function classes V(X) and V.(X). V(X) is the class of all functions x(¢) on 
(0, 1) to X for which #[x(t)] is of bounded variation and continuous on the 
left, while V.(X) is the subclass of V(X) for which the set 


$[\,\sum_i (x(t_i) - x(t_i')) \mid (t_i, t_i') \text{ disjoint intervals}]$


is compact. The L.U.B. [variation of £[x(¢) ]|||#|] =1] exists and can be de- 
fined to be the norm ||#|| for elements of V(X). It is easily shown that V(X) 
is a Banach space having V.(X) as a closed linear subspace. 


2.9. DEFINITION. For functions $(t) e C and & e V(X) we define 


f = lim — of type 3) 


whenever for t{ an arbitrary element of (t;, ts-1) the limit exists. 


2.10. THEorem. If $(t) e C and V(X), then exists and L.U.B. 
[Il | | $1] 


This theorem has likewise been proven by Gowurin [13]. It is clear that 
&[/odx | = {¢d#(x) so that this integral when it exists is equal to the integral 
defined by Gelfand [12, pp. 259-260]. 

3. On conditionally compact sets in a Banach space. In this section we 
will consider two different characterizations of conditionally compact sets in 
a Banach space X. The first is given in terms of a determining manifold $\Gamma$,
while the second involves the notion of a generalized base. 


3.1. THEOREM. A necessary and sufficient condition that the set $S = [x]$ be
conditionally compact is that both L.U.B. $[\,|\bar{x}(x)| \mid x \in S\,] < \infty$ for each $\bar{x} \in \Gamma$
and $U(\bar{x}) = \bar{x}(x)$ on $\Gamma$ to $M_S$(5) be completely continuous.


Let $\{x_n\}$ be any denumerable subset of S. Its linear closed extension Y is
a separable Banach space. Let $\Gamma_1$ be the set of elements of $\Gamma$ considered as
members of the conjugate space of Y. $\Gamma_1$ is clearly a determining manifold
in the conjugate space of Y. The unit sphere of the conjugate space of a
separable Banach space is a compact metric space in its weak topology [1,
p. 186]. Hence $\Gamma_1$ contains a denumerable subset $\{\bar{x}_p\}$ which is weakly dense
in $\Gamma_1$. The linear transformation $V(x) = \{\bar{x}_p(x)\}$ on Y to m defines an
equivalence. It is therefore sufficient to show that the set $\{\bar{x}_p(x_n)\}$ is conditionally
compact in m. By the diagonal procedure we can obtain an n-subsequence
$\bar{x}_p(x_{n'})$ such that $\lim_{n'} \bar{x}_p(x_{n'})$ exists for every p. Moreover this limit exists
uniformly in p. For if the contrary were true there would exist a p-subsequence
having no subsequence for which the limit existed uniformly. As
$\|\bar{x}_p\| \le 1$ and as U is completely continuous, this p-subsequence would have a
subsequence $p'$ for which $\lim_{n'} \bar{x}_{p'}(x_{n'})$ exists uniformly in $p'$, which gives a
contradiction.

To prove the necessity we notice that the closed linear extension Y of S
is a separable Banach space. Hence every bounded sequence of functionals
on Y contains a weakly convergent subsequence [1, p. 123, Theorem 3]. Since
the functionals of the subsequence are uniformly bounded in norm, they are
equi-continuous and therefore converge uniformly on a compact set. The same is
true for every bounded sequence of functionals on X, as we are concerned only
with their values on Y. It follows that $U(\bar{x})$ is completely continuous.

(5) We remind the reader that $M_S$ is the space of bounded functions $\bar{x}(x)$ on S to real
numbers.

Gelfand has proved the following corollary for the case $\Gamma = \bar{X}$ [12, p. 268].
It should be pointed out that the corollary is not true for non-separable X 
as was stated by Gelfand. In his argument he falsely assumed that the func- 
tionals of a weakly convergent sequence of functionals of a closed linear sub- 
space of X could be extended so that the sequence converged weakly on X 
(see 7.5). 


3.2. COROLLARY. A necessary and sufficient condition for a subset S of a
separable Banach space X to be conditionally compact is that all weakly convergent
sequences of functionals of $\Gamma$ on X converge uniformly on S.


Every bounded sequence of functionals on X contains a weakly convergent
subsequence [1, p. 123, Theorem 3]. By hypothesis this sequence converges
uniformly on S and hence the transformation $U(\bar{x}) = \bar{x}(x)$ on $\Gamma$ to $M_S$
is completely continuous. By Theorem 3.1, S is conditionally compact. The
necessity argument is similar to that used in Theorem 3.1.

The following lemma will permit us to prove that the corollary cannot be
extended to non-separable spaces even if, in its statement, $\Gamma$ is replaced by $\bar{X}$.
We now suppose T to be the class of all positive integers t and G the family
of all subsets $\tau$ of T.


3.3. LEMMA. If $\beta^n(\tau)$ are bounded and finitely additive set functions on T to
real numbers, and if $\lim_n \beta^n(\tau) = 0$ for all $\tau \in G$, then $\lim_n \sum_t |\beta^n(t)| = 0$.


Suppose the lemma to be false. Then there exists an e>0 such that 
lim supn >e. Now as n—o. Hence we can choose two 
increasing sequences of integers n;, N; such that DH 2e and 
<e/8. Let us consider for the moment as a pri- 
mary block some subset 7; of N;St<Ni4: for which |6*(7,)| >e/2. Dor; is 
then divided into a denumerable set of disjoint blocks. Since a denumerable 
set has an infinite number of disjoint denumerable subsets and since 6*(7) 
is bounded, there will exist a denumerable subset of blocks 7; such that, on 
any of its subsets 7, | B™(7r)| <e/8. The same argument gives a denumerable 
subset 72 of 7; such that, on any of its subsets 7, | 8"2(ar) | se/8. Likewise we 
can find a denumerable subset 7, of 7,1 such that on any of its subsets 7, 
| B"»(xr)| se/8. Clearly r, ¢ my. Let mo consist of the mth block of 7, for all n. 
If ro contains the block 7;, then there exists g, 2k such that 




Therefore 
| | = | B™*(rx)| — — e/4. 


Since $\pi_0$ contains a denumerable number of such blocks, $\beta^n(\pi_0)$ does not
approach 0, which is contrary to our hypothesis.

As Hildebrandt [14] has shown, to every $\bar{x} \in \bar{m}$ there corresponds a unique
additive bounded set function $\beta(\tau)$ on all sets of integers such that for all
$x \in m$, $\bar{x}(x) = \int_T x(t)\,d\beta$. If $\bar{x}_n(x) = \int_T x(t)\,d\beta^n$ converge weakly to zero on m, then
$\beta^n(\tau) \to 0$ for all $\tau \in G$. We therefore have the following


3.4. COROLLARY. If $\bar{x}_n(x) = \int_T x(t)\,d\beta^n$ converge weakly to zero on m, then

$$\sum_t |\beta^n(t)| \to 0 \quad \text{as } n \to \infty.$$


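3.5. Example. Let S be the set of unit vectors $x_p$ in m. If $\bar{x}_n$ converges
weakly to $\bar{x}_0$, then $\bar{y}_n = \bar{x}_n - \bar{x}_0$ converges weakly to zero. As above, $\bar{y}_n(x)
= \int_T x(t)\,d\beta^n$ and $\sum_t |\beta^n(t)| \to 0$ as $n \to \infty$. Therefore $\bar{y}_n(x_p) = \beta^n(p) \to 0$
uniformly in p. In other words, $\bar{x}_n(x_p) \to \bar{x}_0(x_p)$ uniformly in S.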


3.6. THEOREM. If U is additive and homogeneous on X to Y and $\bar{U}$ is completely
continuous on $\Gamma$ to $\bar{X}$, then U is completely continuous on X to Y.


As Dunford [7, p. 317, Theorem 18] has shown, this hypothesis is sufficient
to make U a linear transformation on X to Y. Let S be the image under
U of $X_1$, the unit sphere of X. $\bar{y}(U(x))$ is then a linear transformation on $\Gamma$
to $M_{X_1}$. Given any sequence $\{\bar{y}_n\}$ in the unit sphere of $\Gamma$, there exists a
subsequence $n'$ such that $\bar{U}(\bar{y}_{n'})$ converges in $\bar{X}$. Hence $\bar{y}_{n'}(U(x)) = \bar{U}(\bar{y}_{n'})(x)$
converges in $M_{X_1}$. We can now apply Theorem 3.1 with $S = U(X_1)$. S is
conditionally compact and hence U is completely continuous.

We will now give a second characterization of conditionally compact sets
in a Banach space(6). $\Pi$ will be a general range of elements $\pi$, transitive and
compositive with respect to the relation $\ge$. $U_\pi$ will be a set of completely
continuous transformations on X to X defined on $\Pi$ with the properties:
(1) For every $x \in X$, $\lim_\pi U_\pi(x)$ exists and is equal to x. (2) There exists a
positive number M such that $\|U_\pi\| \le M$ for all $\pi \in \Pi$. When the $U_\pi$ are in addition
degenerate(7), such a class of transformations is called a generalized base of X.


3.7. THEOREM. Necessary and sufficient conditions that a set $S \subset X$ be
conditionally compact are

(1) L.U.B. $[\,\|x\| \mid x \in S\,] < \infty$,

(2) $\lim_\pi \|U_\pi(x) - x\| = 0$ uniformly in S.


If we suppose S to be conditionally compact, then given $\epsilon > 0$, there exist
$x_1, x_2, \dots, x_n \in S$ such that for any $x \in S$ there is a k for which $\|x - x_k\| \le \epsilon$.
For the set $x_1, x_2, \dots, x_n$ there exists a $\pi_\epsilon$ such that if $\pi \ge \pi_\epsilon$, then
$\|U_\pi(x_k) - x_k\| \le \epsilon$ for every k. Therefore if $x \in S$ and $\pi \ge \pi_\epsilon$,

$$\|U_\pi(x) - x\| \le \|U_\pi(x) - U_\pi(x_k)\| + \|U_\pi(x_k) - x_k\| + \|x_k - x\| \le \epsilon(M + 2),$$

which proves the necessity.

(6) Dr. T. H. Hildebrandt suggested Theorem 3.7 as a generalization of the author's
application of it to L.

(7) A degenerate transformation on X to Y is a linear transformation on X to a finite
dimensional subspace of Y.

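The sufficiency argument is as follows: Given $\epsilon > 0$, there exists $\pi_\epsilon$ such
that $\|U_{\pi_\epsilon}(x) - x\| \le \epsilon/3$ for all $x \in S$. As $U_{\pi_\epsilon}$ is completely continuous and as
L.U.B. $[\,\|x\| \mid x \in S\,] < \infty$, it follows that there exist $x_1, x_2, \dots, x_n \in S$ such that
for any $x \in S$ there is a k for which $\|U_{\pi_\epsilon}(x) - U_{\pi_\epsilon}(x_k)\| \le \epsilon/3$. Therefore

$$\|x - x_k\| \le \|x - U_{\pi_\epsilon}(x)\| + \|U_{\pi_\epsilon}(x) - U_{\pi_\epsilon}(x_k)\| + \|U_{\pi_\epsilon}(x_k) - x_k\| \le \epsilon.$$

S is therefore totally bounded or, equivalently, conditionally compact.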

Theorem 3.7 gives a characterization of conditionally compact sets in X
which contains as a special case that given by Kolmogoroff [18], Tamarkin
[26], and Tulajkov [27] for $L^p(\alpha)$ where $T = (0, 1)$, $\alpha$ is the Lebesgue measure,
and $1 \le p < \infty$. In this case $\Pi$ is the set of integers and

$$U_n(\phi) = \frac{n}{2} \int_{t-1/n}^{t+1/n} \phi(s)\,ds \qquad (\phi \in L^p).$$

For $1 < p < \infty$, $[\,U_n(\phi) \mid \|\phi\| \le 1\,]$ is uniformly bounded and equi-absolutely
continuous, and therefore is conditionally compact in $L^p$. For $p = 1$,
$[\,U_n(\phi) \mid \|\phi\| \le 1\,]$ is of uniform bounded variation, and therefore is
conditionally compact in L. Finally $\|U_n\| \le 1$ for $1 \le p < \infty$. The conclusions of
the theorem are consequently valid.

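Theorem 3.7 can also be applied to the spaces $L^p(\alpha)$ ($1 \le p \le \infty$) where T
is an abstract class of elements. For $1 \le p < \infty$, let $\pi$ be of type 1 and contain
only a finite number of disjoint measurable sets $(\tau_1, \tau_2, \dots, \tau_n)$. $\chi_\tau$ will denote
the characteristic function of the set $\tau$. Finally we define $U_\pi$ on $L^p(\alpha)$ to
$L^p(\alpha)$ to be

$$U_\pi(\phi) = \sum_i \chi_{\tau_i}(t)\,\frac{1}{|\tau_i|} \int_{\tau_i} \phi\,d\alpha.$$

For $p = \infty$, let $\pi$ be of type 2 and contain the disjoint measurable sets
$(\tau_1, \tau_2, \dots, \tau_n)$. Then $U_\pi$ on $L^\infty(\alpha)$ to $L^\infty(\alpha)$ will be

$$U_\pi(\phi) = \sum_i \chi_{\tau_i}(t)\,\frac{1}{|\sigma_i|} \int_{\sigma_i} \phi\,d\alpha,$$

where $\sigma_i$ is some set of finite measure contained in $\tau_i$. The $U_\pi$ clearly define a
generalized base for $L^p(\alpha)$.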
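The averaging transformations used here as a generalized base can be tried out numerically. The following is only a sketch under stated assumptions: the interval (0, 1) is replaced by N equal cells, a subdivision is a family of equal-width blocks of cells, and the names `block_average`, `lp_norm`, and the sample function $\phi(t) = t^2$ are illustrative choices that do not come from the paper. It checks that averaging does not increase the norm and that the error shrinks as the subdivision is refined.

```python
def block_average(values, n_blocks):
    """Replace each sample by the mean of its block (equal-width blocks)."""
    size = len(values) // n_blocks
    out = []
    for j in range(n_blocks):
        block = values[j * size:(j + 1) * size]
        mean = sum(block) / len(block)
        out.extend([mean] * len(block))
    return out


def lp_norm(values, p):
    """Discrete L^p norm on (0, 1), each cell carrying weight 1/len(values)."""
    return (sum(abs(v) ** p for v in values) / len(values)) ** (1.0 / p)


N = 1024                                        # number of cells (assumed)
phi = [((i + 0.5) / N) ** 2 for i in range(N)]  # samples of phi(t) = t^2
p = 2

errors = []
for n_blocks in (2, 8, 32, 128):
    averaged = block_average(phi, n_blocks)
    # Averaging never increases the L^p norm, so the operator norm is <= 1 here.
    assert lp_norm(averaged, p) <= lp_norm(phi, p) + 1e-12
    errors.append(lp_norm([a - b for a, b in zip(averaged, phi)], p))

print(errors)  # the error decreases as the subdivision is refined
```

The discrete model only illustrates properties (1) and (2) of a generalized base for a single function; it says nothing about uniformity over a conditionally compact set, which is the content of Theorem 3.7.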

4. Linear transformations. In the following discussion, for $1 \le p < \infty$, $\pi$ will
be of type 1; while for $p = \infty$, $\pi$ will be of type 2. $L^p$ ($1 \le p < \infty$) will be the
space $L^p(\alpha)$. $L^\infty$ will be either (a) the space of bounded G-measurable functions,
or (b) the space of G-measurable functions essentially bounded relative
to a measure $\alpha$, with $x(\tau) \in V^1(X, \Gamma)$ vanishing on null sets in the latter case.


1/p+1/q=1. 
4.1. THEOREM. The general form of the linear transformation $U(\phi)$ on $L^p$
($1 \le p \le \infty$) to X is

$$U(\phi) = \int \phi\,dx$$

where $x \in V^q(X, \Gamma)$ and $\|U\| = \|x\|$.


If $x \in V^q(X, \Gamma)$ then it follows from Theorems 2.3, 2.4, 2.7, and 2.8 that
$\int \phi\,dx$ is a linear transformation on $L^p$ to X with $\|U\| = \|x\|$.

To demonstrate the converse, let $\chi_\tau(t)$ be the characteristic function of
$\tau \in G$ for $|\tau| < \infty$. We define

$$x(\tau) = U[\chi_\tau(t)].$$

$x(\tau)$ is obviously additive on sets $\tau$ of finite measure. For $p = 1$,


||| = L.v.B. [|2[v(@)]| = | O(a) | Nall = 1, loll = 1] 


| 


= L.U.B. |r| < |. 


Therefore V°(X) and || =||2||. For 1<p< ©, we define, for a given 
and {a,} e when ¢ e 7;. Then 


loll = { f = 


Finally 


LUB. sti x, |lal| = = 


Again e V*(X, and || U|| =||4||. For p= 




Jol] = Lv.B. | zeT, =1, x, s 


| eer, lal = = 


V(X, T). To each $(¢) e and we associate the multiple valued func- 
tion =¢(7;) for t e 7; where ¢; e r;. Then ¢=lim, ¢, in and by Lemma 
2.2 and Theorems 2.3 and 2.7 


= lim = f ode. 
For $p = \infty$ Theorem 4.1 has been demonstrated by Gowurin [13, pp. 265-266].

4.2. COROLLARY. $V^q(X, \Gamma)$ is equivalent to $V^q(X, \bar{X})$.


By Theorem 4.1, $V^q(X, \Gamma)$ is equivalent to the space of all linear
transformations on $L^p$ to X for all $\Gamma$.


4.3. THEOREM. The general form of the completely continuous transformation
on $L^p$ to X is

$$U(\phi) = \int \phi\,dx$$

where $x \in V^q_c(X, \Gamma)$ and $\|U\| = \|x\|$.

This is an immediate consequence of Theorems 3.6 and 4.1.

4.4. COROLLARY. $V^q_c(X, \Gamma)$ is equivalent to $V^q_c(X, \bar{X})$.


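4.5. THEOREM. The general form of the linear transformation $U(x)$ on X to
$V^q(R)$ ($1 < q < \infty$), where $T = \sum_i \tau_i$ ($|\tau_i| < \infty$), is

$$U(x) = \bar{x}(\tau)[x]$$

where $\bar{x} \in V^q(\bar{X}, X)$ and $\|U\| = \|\bar{x}\|$.


It is clear that $\bar{x} \in V^q(\bar{X}, X)$ defines such a transformation and that
$\|U\| = \|\bar{x}\|$. Conversely, if U is linear on X to $V^q(R)$, then its adjoint $\bar{U}$
defines a transformation on $\bar{V}^q(R)$, or its equivalent $L^p(\alpha)$, to $\bar{X}$. By Theorem 4.1

$$\bar{U}(\phi) = \int \phi\,d\bar{x}$$

where $\phi \in L^p(\alpha)$, $\bar{x} \in V^q(\bar{X}, X)$, and $\|\bar{U}\| = \|\bar{x}\|$. Since $\phi[U(x)] = \int \phi\,d\bar{x}(x)$ for
every $\phi \in L^p(\alpha)$, it follows that $U(x) = \bar{x}(\tau)[x]$ and $\|U\| = \|\bar{x}\|$.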


(8) Compare with Kantorovitch and Vulich [17, pp. 133-135].


4.6. THEOREM. The general form of the completely continuous transformation
$U(x)$ on X to $V^q(R)$ ($1 < q < \infty$), where $T = \sum_i \tau_i$ ($|\tau_i| < \infty$), is

$$U(x) = \bar{x}(\tau)[x]$$

where $\bar{x} \in V^q_c(\bar{X}, X)$ and $\|U\| = \|\bar{x}\|$.


The argument used in Theorem 4.5 applies here if we note that $\bar{U}$ is necessarily
completely continuous [1, p. 101, Theorem 4]. The reference is now
made to Theorem 4.3 instead of Theorem 4.1.

If T=(0, 1) and a is the Lebesgue measure, then the general form of the 
linear (or completely continuous) transformation on X to L¢ (1<q< @) is 


U(x) = [a] 


where $\bar{x} \in V^q(\bar{X}, X)$ (or $V^q_c(\bar{X}, X)$), $\|U\| = \|\bar{x}\|$, and $I_t = (0, t)$. This is a slightly
stronger result than that found by Bochner and Taylor [5, pp. 941-944, Theorems
8.1 and 8.4].

A linear transformation on L? to L®’, where a is the Lebesgue measure 
on (0, 1), is characterized by a function K(s, 7) for which 


df! 
¥(s)K(s, Io)ds e Le 


for every y e L*’ (Is =(0, #)). If in addition the transformation is completely 
continuous, then K(s, 7) also satisfies the condition 


uniformly for ally e L’. 

We leave the proof of the following theorems to the reader. Except for
the space $c_0$, the argument is a special case of the above. Gelfand has discussed
the space c [12, pp. 272-275]. It is convenient to denote the space $c_0$ by the
symbol $l^\infty$.


4.7. THEOREM. The general form of the linear [or completely continuous]
transformation $U(a)$ on $l^p$ ($1 \le p \le \infty$) to X is

$$U(a) = \sum_i a_i x_i$$

where $x \in v^q(X, \Gamma)$ [or $v^q_c(X, \Gamma)$] and $\|U\| = \|x\|$.


4.8. COROLLARY. $v^q(X, \Gamma)$ [or $v^q_c(X, \Gamma)$] is equivalent to $v^q(X, \bar{X})$ [or
$v^q_c(X, \bar{X})$].


4.9. COROLLARY. If X is either weakly complete or a separable conjugate
space, then any linear transformation on $c_0$ to X is completely continuous.




$x \in v^1(X, \bar{X})$ implies that $x \in v^1_c(X, \bar{X})$ according to a theorem of Orlicz
[21, pp. 244-247] and Theorem 1.8. The conclusion follows from Theorem
4.7.


4.10. THEOREM. The general form of the linear [or completely continuous]
transformation $U(x)$ on X to $l^q$ ($1 \le q < \infty$) is

$$U(x) = \{\bar{x}_i(x)\}$$

where $\bar{x} \in v^q(\bar{X}, X)$ [or $v^q_c(\bar{X}, X)$] and $\|U\| = \|\bar{x}\|$.


4.11. COROLLARY. If X is either weakly complete or a separable conjugate
space, then any linear transformation on X to $l^1$ is completely continuous.


We conclude this section with some considerations about linear transformations
on C to X. It follows from Theorem 2.10 that $U(\phi) = \int \phi\,dx$, where
$\phi \in C$ and $x \in V(X)$, is a linear transformation on C to X with $\|U\| = \|x\|$.
Gelfand [12, p. 283] has shown that the general form of a completely continuous
transformation on C to X is

$$U(\phi) = \int \phi\,dx$$

where $x \in V_c(X)$ and $\|x\| = \|U\|$. When X is weakly complete, Gelfand has
shown this to be the general form of the linear transformation on C to X,
where now $x \in V(X)$. It might be added that Gelfand's method will show this
to be true for all conjugate Banach spaces X.

It is easy to give an example of a linear transformation on C to X which
does not have this general form. Let U be the identity transformation on C
to C and suppose that it does have this form. Then $\phi(s) = U(\phi) = \int \phi(t)\,d\psi_t(s)$.
As this holds for all $\phi \in C$, $\psi_t(s) = c$ ($s > t$) and $\psi_t(s) = 1 + c$ ($s < t$), which is
contrary to $\psi_t(s) \in C$ for fixed t. Because of the above remark, this again
shows that C is not a conjugate space.


The following theorem gives a characterization for linear transformations
on C to X:


4.12. THEOREM. A necessary and sufficient condition that U be a linear
transformation on C to X is that there exist a sequence of step functions $x_n \in V(X)$ such
that $\lim_n \|x_n\| = \|U\|$ and

$$U(\phi) = \lim_n \int \phi\,dx_n.$$


Making use of the Bernstein polynomials, we have

$$\phi(t) = \lim_n \sum_{r=0}^{n} \phi(r/n)\,C^n_r\,t^r (1-t)^{n-r}$$

in C. If we apply a device due to Hildebrandt and Schoenberg [15, p. 318],
then

$$U(\phi) = \lim_n U_n(\phi)$$


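where $U_n(\phi) = \int \phi\,dx_n$, $x_n(t)$ being defined to have the jump $U[C^n_r\,t^r(1-t)^{n-r}]$
at $r/n$ and to be constant elsewhere. Now

$$\|U_n(\phi)\| = \Big\|U\Big[\sum_{r=0}^{n} \phi(r/n)\,C^n_r\,t^r(1-t)^{n-r}\Big]\Big\| \le \|U\|\,\|\phi\|.$$

Therefore $\|x_n\| = \|U_n\| \le \|U\|$. In general, however, $\liminf_{n\to\infty} \|U_n\| \ge \|U\|$, so
that $\lim_{n\to\infty} \|x_n\| = \|U\|$. Since $x_n$ does define a linear transformation, the
sufficiency argument is obvious [1, p. 80, Theorem 5].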
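The uniform convergence of Bernstein polynomials on which this proof rests can be checked numerically. The sketch below is illustrative only: the function names `bernstein` and `sup_error`, the test function $|t - 1/2|$, and the evaluation grid are assumptions made for the example and do not come from the paper.

```python
from math import comb


def bernstein(phi, n, t):
    """n-th Bernstein polynomial of phi at t: sum of phi(r/n) C(n,r) t^r (1-t)^(n-r)."""
    return sum(phi(r / n) * comb(n, r) * t ** r * (1 - t) ** (n - r)
               for r in range(n + 1))


def sup_error(phi, n, grid=200):
    """Largest deviation |B_n(phi) - phi| over an equally spaced grid on [0, 1]."""
    return max(abs(bernstein(phi, n, j / grid) - phi(j / grid))
               for j in range(grid + 1))


def phi(t):
    """A continuous but not differentiable test function (assumed example)."""
    return abs(t - 0.5)


errors = [sup_error(phi, n) for n in (4, 16, 64, 256)]
print(errors)  # decreasing, although slowly because of the corner at t = 1/2
```

Since the weights $C^n_r\,t^r(1-t)^{n-r}$ are non-negative and sum to one, the polynomial never exceeds the bound of $\phi$ in absolute value, which is the fact behind the inequality $\|U_n\| \le \|U\|$.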

5. Linear transformations on L. In this section we obtain representations
by means of a kernel of the general completely continuous transformation
and weakly completely continuous separable transformation on L to an arbitrary
Banach space X. By means of this result and a theorem due to Dunford
and Pettis [9], we show that $U^2$ is completely continuous whenever U on L
to L is weakly completely continuous. Special cases of Theorems 5.3 and 5.4
have been proved by Gelfand [12]. More recently Dunford and Pettis [9]
have obtained special cases of Theorems 5.3, 5.4, and 5.5.

In this and the following section, $L^p$ will be the space $L^p(\alpha)$ where T is
the sum of a denumerable number of sets $\tau \in G$ of finite measure. $\pi$ will be
defined as at the end of §3. $x(t)$ on T to X will be said to be weakly measurable
if $\bar{x}(x(t))$ is measurable for all $\bar{x} \in \bar{X}$. We define $B^\infty(X)$ to be the class of
weakly measurable point functions $x(t)$ on T to X whose values are essentially
contained in a separable conditionally weakly compact subspace of X.
With norm

$$\|x\| = \text{ess. L.U.B.}_t\,[\,\|x(t)\|\,],$$

$B^\infty(X)$ is a Banach space. The set of functions $x(t) \in B^\infty(X)$ which take on a.e.
a conditionally compact set of values will comprise the subspace $B^\infty_c(X)$ of
$B^\infty(X)$.

Integration with respect to a real valued measure function $\alpha$ will be realized
by means of the Birkhoff integral [3]. Since $x(t)$ for $x \in B^\infty(X)$ is a.e.
contained in a separable subspace of X, $x(t)$ is integrable on all sets $\tau \in G$ of
finite measure [22, Theorems 1.1 and 5.3, Corollary 5.11].


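5.1. LEMMA. If $x(t) \in B^\infty(X)$, then

$$\|x\| = \text{L.U.B.}\left[\frac{\left\|\int_\tau x(t)\,d\alpha\right\|}{|\tau|} \;\middle|\; 0 < |\tau| < \infty\right].$$

Since $x(t)$ is essentially contained in a separable subspace $X'$ of X, it is
clear that there exists a denumerable set of linear functionals $\{\bar{x}_n\} \subset \bar{X}$, each
of norm one, such that L.U.B.$_n\,|\bar{x}_n(x)| = \|x\|$ for $x \in X'$. Therefore

$$A = \text{L.U.B.}\left[\frac{\left\|\int_\tau x(t)\,d\alpha\right\|}{|\tau|} \;\middle|\; 0 < |\tau| < \infty,\ \tau \in G\right]
= \text{L.U.B.}\left[\frac{\left|\int_\tau \bar{x}_n(x(t))\,d\alpha\right|}{|\tau|} \;\middle|\; 0 < |\tau| < \infty,\ \tau \in G,\ n = 1, 2, \dots\right].$$

Let $\tau_n = [\,t \mid |\bar{x}_n(x(t))| > A\,]$. Clearly $|\tau_n| = 0$. $\tau_0 = [\,t \mid \|x(t)\| > A,\ x(t) \in X'\,]$ is
contained in $\sum \tau_n$ and hence is of measure zero. On the other hand, for every
$\epsilon > 0$ there exists an n such that ess. L.U.B.$_t\,|\bar{x}_n(x(t))| > A - \epsilon$. It follows that
ess. L.U.B.$_t\,\|x(t)\| = A$.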


5.2. THEOREM. $V^\infty_c(X)$ is equivalent to $B^\infty_c(X)$. The equivalence is defined
by $U(x(t)) = x(\tau) = \int_\tau x(t)\,d\alpha$ on $B^\infty_c(X)$ to $V^\infty_c(X)$.


It follows from Lemma 5.1 that U is an isometric transformation.
By the definition of the Birkhoff integral, given any null set $\tau_0$,
$S = [\,x(\tau)/|\tau| \mid 0 < |\tau| < \infty,\ \tau \in G\,]$ is contained in the convex extension of
$[\,x(t) \mid t \in T - \tau_0\,]$. Therefore S is conditionally compact. By Theorem 1.4,

$$x(\tau) = U(x(t)) \in V^\infty_c(X).$$

We next prove the converse. Let $x(\tau) \in V^\infty_c(X)$. $\bar{x}(x(\tau))$ is then a completely
additive set function on all measurable subsets of any set of finite
measure. By the Radon-Nikodym theorem there exists $f_{\bar{x}}(t) \in B^\infty(R)$ such
that $\bar{x}(x(\tau)) = \int_\tau f_{\bar{x}}(t)\,d\alpha$ for all $\tau \in G$ of finite measure. As above, this defines
an isometric transformation V on $B^\infty(R)$ to $V^\infty(R)$. By definition,
$[\,\bar{x}(x(\tau)) \mid \bar{x} \in \bar{X},\ \|\bar{x}\| \le 1\,]$ is conditionally compact in $V^\infty(R)$. Therefore
$P = [\,f_{\bar{x}}(t) \mid \bar{x} \in \bar{X},\ \|\bar{x}\| \le 1\,]$ is conditionally compact in $B^\infty(R)$. Defining $U_\pi$
as in §3, it follows from Theorem 3.7 that $\lim_\pi U_\pi(f_{\bar{x}}(t)) = f_{\bar{x}}(t)$ uniformly in P
in the topology of $B^\infty(R)$.


V(U,(fe(t))) = | r-75|. 


lim V(U,(fe(#))) = V(fe(#)) 
uniformly in P. This implies 


| 


in V*.(X). Define x,(t)=x(o;)/|o;| for te 74; and x,(r) = Then 
x-(t) e BY.(X); 


Then 




x 
«(r) = >> | | = 
| 
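By (1), given $\epsilon > 0$, there exists a $\pi_\epsilon$ such that for $\pi_1, \pi_2 \ge \pi_\epsilon$, $\|x_{\pi_1}(\tau) - x_{\pi_2}(\tau)\| \le \epsilon$.
According to Lemma 5.1, ess. L.U.B.$_t\,\|x_{\pi_1}(t) - x_{\pi_2}(t)\| \le \epsilon$. Since $B^\infty_c(X)$ is
a Banach space, it follows from Lemma 2.2 that there exists an $x(t) \in B^\infty_c(X)$
such that $\lim_\pi x_\pi(t) = x(t)$ in $B^\infty_c(X)$. Then $x(\tau) = \lim_\pi U(x_\pi(t)) = U(x(t))$.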


5.3. THEOREM. The general form of the completely continuous transformation
U on L to X is

$$U(\phi) = \int_T \phi(t)\,x(t)\,d\alpha$$

where $x(t) \in B^\infty_c(X)$ and $\|U\| = \|x\|$.


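According to Theorem 4.3, $U(\phi) = \int \phi(t)\,dx$ where $x(\tau) \in V^\infty_c(X)$ and
$\|U\| = \|x(\tau)\|_{V^\infty(X)}$. By Theorem 5.2 there exists an $x(t) \in B^\infty_c(X)$ such that
$x(\tau) = \int_\tau x(t)\,d\alpha$ for all $\tau \in G$ of finite measure and $\|x(t)\|_{B^\infty(X)} = \|x(\tau)\|_{V^\infty(X)}$.
Further, by Theorem 2.5, $\int_\tau \phi(t)\,dx = \int_\tau \phi(t)x(t)\,d\alpha$ for all $\tau \in G$ of finite
measure. Now $T = \sum \tau_i$ where $|\tau_i| < \infty$ and $\tau_i \cdot \tau_j = 0$ if $i \ne j$. Given $\epsilon > 0$, one
can obtain an unconditionally convergent sum of the type $\sum_j \phi(t^i_j)\,x(t^i_j)\,|\tau^i_j|$
($(\tau^i_j)$ is a subdivision of $\tau_i$, $t^i_j \in \tau^i_j$) which approximates $\int_{\tau_i} \phi(t)x(t)\,d\alpha$ to within
$\epsilon/2^i$ and such that each of its finite partial sums is within $\epsilon/2^i$ of some
$\int_\tau \phi(t)x(t)\,d\alpha$ for $\tau \subset \tau_i$. Since by Theorem 2.4, $\|\int_\tau \phi\,dx\| \le \|x\| \int_\tau |\phi|\,d\alpha$,
it follows that the resulting subdivision of T furnishes an unconditionally convergent
sum which approximates $\int_T \phi(t)\,dx$ to within $\epsilon$. $\int_T \phi(t)x(t)\,d\alpha$ therefore
exists and is equal to $\int_T \phi(t)\,dx$.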


5.4. THEOREM. The general form of the weakly completely continuous separable
transformation on L to X is

$$U(\phi) = \int_T \phi(t)\,x(t)\,d\alpha$$

where $x(t) \in B^\infty(X)$ and $\|U\| = \|x\|$.


According to Theorem 4.1, $U(\phi) = \int \phi(t)\,dx$ where $x(\tau) \in V^\infty(X)$ and
$\|U\| = \|x(\tau)\|_{V^\infty(X)}$. If $\chi_\tau(t)$ is the characteristic function of $\tau \in G$ for $|\tau| < \infty$,
then $x(\tau) = U(\chi_\tau(t))$. As U is weakly completely continuous and separable, it
is clear that $S = [\,x(\tau)/|\tau| \mid 0 < |\tau| < \infty\,]$ is conditionally weakly compact and
is contained in a separable linear closed subspace Y of X. Hence there exists
a sequence $\{\bar{x}_i\} \subset \bar{X}$ which, when considered as elements of $\bar{Y}$, are dense in
the unit sphere of a determining manifold in $\bar{Y}$. By the Radon-Nikodym theorem
there exists for each $\bar{x} \in \bar{X}$ an $f_{\bar{x}}(t) \in B^\infty(R)$ such that $\bar{x}(x(\tau)) = \int_\tau f_{\bar{x}}(t)\,d\alpha$
for all $\tau \in G$ of finite measure. As in Theorem 5.2, let $\pi$ be a finite subdivision




of T into disjoint measurable sets (71, 7T2,-:°:, Tn), let o¢7; such that 
0< | <o, and let x,(t) =x(01)/|;| for ter; Then for every X, 
lim, #(x,(t)) =f,(t) in B°(R). There will therefore exist a set {rn} such that 
| —fe,(t)| <1/m on (|on| =0) for all Hence for each i, 
uniformly on where oo (|oo| =()). For a given t, 
%,,(t) is contained in S. A subsequence will therefore converge weakly to an 
element of Y [1, p. 134, Theorem 2]. We arbitrarily define x(t) to be the weak 
limit of one such subsequence. Clearly #,(x(t)) =lim, &:(x,,(t)) =f,,(t) on 
As | #4(x(t))| sU on and x(t) e Y, it follows that sU 
on T—go. Further #;(x(t)) is measurable, Y is separable, and the sequence #; 
is dense in the unit sphere of a determining manifold in Y if the #; are con- 
sidered as elements of Y. From this one can easily show that x(t) is weakly 
measurable. As x(#) is contained in the sequential weak closure of S, 
[x(t) |t e T | is conditionally weakly compact(*). Therefore x(t) e B*(X). Now 
4:(x(r)) = (t)da = for all r G of finite measure. As {%;} is 
total in Y, it follows that x(r) = /,x(t)da for all r of finite measure [23, Theo- 
rem 5.3]. By Lemma 5.1 ||x(¢)|] x) =|| Ul]. The remainder of 
the argument is identical to that used in Theorem 5.3. 

We remark that Theorem 5.4 is applicable to any separable linear trans- 
formation on L to a regular Banach space since the unit sphere of a regular 
space is weakly compact. 

In the following theorem and corollary, T need not be the sum of a
denumerable number of sets $\tau \in G$ of finite measure.


5.5. THEOREM. If U is a weakly completely continuous transformation on L
to an arbitrary Banach space X, then U takes conditionally weakly compact sets
into conditionally compact sets.


It is sufficient to show that for any conditionally weakly compact sequence
$\{\phi_n\}$, $\{U(\phi_n)\}$ is conditionally compact. The sequence $\{\phi_n\}$ is contained
in a separable subspace $L'$ of L essentially defined on a class $T' \subset T$
which is the sum of a denumerable number of sets of finite measure(10). Let
$U'$ on $L'$ to X be identical with U on $L'$. As $L'$ is separable, $U'$ is a separable
weakly completely continuous transformation on $L'$ to X. Theorem 5.4 is
applicable, and hence by a theorem due to Dunford and Pettis [9, p. 547,
Theorem 4] $U'$ takes conditionally weakly compact sets into conditionally
compact sets. Since $\{\phi_n\}$ is conditionally weakly compact in $L'$, this concludes
the proof.


(9) W. L. Chmoulyan has shown that the weak sequential closure of a weakly compact
subset of a Banach space is itself weakly compact. See Communications de l'Institut des
Sciences Mathématiques et Mécaniques de l'Université de Kharkoff et la Société Mathématique
de Kharkoff, (4), vol. 14 (1937), pp. 239-242.

(10) One can readily obtain this result by employing an argument similar to that used by
Dunford [8, p. 644].


5.6. COROLLARY. If U is weakly completely continuous on L to L, then $U^2$ is
completely continuous.


U takes the unit sphere in L into a conditionally weakly compact subset
of L, and by Theorem 5.5 its iterate takes this subset into a conditionally compact
subset of L. In other words $U^2$ is completely continuous.

A uniform mean ergodic theorem for weakly completely continuous
transformations on L to L is easily obtainable by means of Corollary 5.6
and a mean ergodic theorem due to Kakutani [16] and Yosida [28].

6. On completely continuous transformations. In this section we make
further application of our study, demonstrating that each completely continuous
transformation on any of the spaces $L^p$, $l^p$ ($1 \le p \le \infty$), C, $c_0$, $M_T$ to
an arbitrary Banach space X can be approximated in the norm by degenerate
transformations (see footnote 7). The notation is that of §5.

For (1<q< ©) we define 


and for V“(X) we define 


= | | < Ti, | < ©), 


Clearly V%.(X). 
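6.1. THEOREM. If $x \in V^q_c(X)$ ($1 < q \le \infty$), then $\lim_\pi \|x_\pi - x\| = 0$.


By definition, the set $[\,\bar{x}(x(\tau)) \mid \|\bar{x}\| \le 1\,]$, where $x \in V^q_c(X)$, is a conditionally
compact subset of $V^q(R)$. If we use the usual isometric correspondence
between $V^q(R)$ and $L^q$, it follows immediately from Theorem 3.7 that

$$\lim_\pi \|x_\pi - x\| = \lim_\pi \text{L.U.B.}\,[\,\|\bar{x}(x_\pi) - \bar{x}(x)\| \mid \|\bar{x}\| \le 1\,] = 0.$$


6.2. COROLLARY. If U is a completely continuous transformation on $L^p$ to X
($1 \le p < \infty$), then
(1) $U(\phi) = \int \phi\,dx$ where $x \in V^q_c(X)$,
(2) if $U_\pi(\phi) = \sum_i \dfrac{x(\tau_i)}{|\tau_i|} \int_{\tau_i} \phi\,d\alpha$ ($1 < p < \infty$) or if $U_\pi(\phi) = \sum_i \dfrac{x(\sigma_i)}{|\sigma_i|} \int_{\tau_i} \phi\,d\alpha$
($p = 1$), then $\lim_\pi \|U_\pi - U\| = 0$.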


This is a consequence of Theorems 4.3 and 6.1. 
For notational convenience, we write $c_0 = l^\infty$.


6.3. THEOREM. If U is a completely continuous transformation on $l^p$
($1 \le p \le \infty$) to X, then

(2) if Un(a) then lim, || Ul] =0. 


For $p = \infty$, this is a consequence of Theorems 1.7 and 4.7. For $1 \le p < \infty$,
the theorem is a special case of Corollary 6.2.


6.4. THEOREM. If $x \in V^1_c(X)$, then there exists a non-negative $\beta(\tau) \in V^1(R)$
such that


where $\pi$ is of type 2.


By the definition of $V^1_c(X)$, $S = [\,\bar{x}(x(\tau)) \mid \|\bar{x}\| \le 1\,]$ is a conditionally compact
subset of $V^1(R)$. Hence there exists a sequence $[\,\bar{x}_n \mid \|\bar{x}_n\| \le 1,\
n = 1, 2, \dots\,]$ such that $\bar{x}_n(x(\tau))$ is dense in S. Let $\beta_n(\tau)$ be the absolute
variation of $\bar{x}_n(x(\tau))$. Define $\beta(\tau) = \sum_n (1/2^n)\,\beta_n(\tau)$. Clearly $\beta(\tau) \in V^1(R)$ and
$|\beta_n(\tau)| \le 2^n |\beta(\tau)|$. $\bar{x}_n(x(\tau))$ is therefore absolutely continuous with respect to
$\beta(\tau)$. The class of all elements of $V^1(R)$ absolutely continuous with respect
to $\beta(\tau)$ forms a closed linear space $AC(\beta)$. Let $\pi = (\tau_1, \tau_2, \dots, \tau_n)$ be of type 2
and such that $\beta(\tau_i) \ne 0$ ($i = 1, 2, \dots, n$). Define

T 
= 
on $AC(\beta)$ to $AC(\beta)$. Then $\|U_\pi\| \le 1$. By a theorem due to Bochner [4, pp.
780-783], $\lim_\pi U_\pi(y(\tau)) = y(\tau)$ for all $y(\tau) \in AC(\beta)$. By Theorem 3.7,
$\lim_\pi U_\pi(\bar{x}_n(x(\tau))) = \bar{x}_n(x(\tau))$ uniformly in n and hence


6.5. COROLLARY. If U is a completely continuous transformation on $M_T$
[or $L^\infty$] to X, then
(1) $U(\phi) = \int \phi\,dx$ where $x \in V^1_c(X)$,
(2) there exists a $\beta(\tau) \in V^1(R)$ such that if
J+ @ap 


then $\lim_\pi \|U_\pi - U\| = 0$.
This is a consequence of Theorems 4.3 and 6.4.


6.6. COROLLARY. If U is a completely continuous transformation on C to X, 
then U is approximable in the norm by degenerate transformations. 


According to a result of Gelfand's [12, p. 283], $U(\phi) = \int \phi\,dx$ where
$x \in V_c(X)$ and $\|x\| = \|U\|$ (see end of §2). Now $\lim_{t' \to t+} \bar{x}(x(t'))$ and $\lim_{t' \to t-} \bar{x}(x(t'))$
exist by virtue of $\bar{x}(x(t))$'s being of bounded variation. Since the values assumed
by $x(t)$ form a conditionally compact set, it follows that $x(t+)$ and $x(t-)$
are defined for all t. Let G be the Jordan field of sets $\tau$ generated by all open
intervals and points of $(0, 1)$. If $\tau$ consists of the disjoint sets $[(a_1, b_1), \dots,
(a_k, b_k);\ c_1, \dots, c_m]$, define


x(r) = — + [2(ci) — 
tml i=l 

Clearly x(r) Vi(X), =lim, D> of type 2), and || U|| 

The remainder of the argument follows from Theorem 6.4. 

We remark that in Theorems 6.1, 6.4 and Corollaries 6.2, 6.5, 6.6 the
$\pi$-limit can be replaced by a sequential limit.

The problem of approximating completely continuous transformations on
X to certain spaces Y by degenerate transformations has been investigated
by Maddaus [19]. He shows that this is possible whenever there exist degenerate
transformations $V_n$ such that $\lim_{n\to\infty} V_n(y) = y$ for all $y \in Y$.

Let Y be a Banach space possessing a generalized base $U_\pi$ (see §3). Suppose
U is a completely continuous transformation on X to Y. Then the set
$S = [\,U(x) \mid x \in X,\ \|x\| \le 1\,]$ is conditionally compact. By Theorem 3.7,
$\lim_\pi U_\pi(U(x)) = U(x)$ uniformly for all $x \in X$ for which $\|x\| \le 1$. It follows
that $\lim_\pi \|U_\pi(U) - U\| = 0$. As $U_\pi(U)$ is a degenerate transformation, this
gives Maddaus's result in a slightly more general form.
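A finite-dimensional model makes this approximation concrete. The sketch below is an assumed illustration, not taken from the paper: the space is the first N coordinates of $l^2$, U is the diagonal transformation with entries $1/(k+1)$ (a model of a completely continuous transformation), and $U_n$ keeps the first n coordinates, so that $U_n U$ is of finite rank. For a diagonal transformation the norm of $U_n U - U$ is the largest diagonal entry that is cut off.

```python
from math import sqrt

N = 500                                   # size of the finite model (assumed)
d = [1.0 / (k + 1) for k in range(N)]     # diagonal entries of U, tending to 0


def apply_U(x):
    """Apply the diagonal transformation U coordinate by coordinate."""
    return [dk * xk for dk, xk in zip(d, x)]


def project(x, n):
    """U_n: keep the first n coordinates and zero the rest (finite rank)."""
    return [xk if k < n else 0.0 for k, xk in enumerate(x)]


def norm(x):
    """Euclidean norm of a finite vector."""
    return sqrt(sum(v * v for v in x))


def gap_norm(n):
    """Operator norm of U_n U - U: the largest diagonal entry beyond index n."""
    return max(d[n:])


gaps = [gap_norm(n) for n in (1, 10, 100)]
print(gaps)

# Spot check on one vector: ||(U_n U - U) x|| <= gap_norm(n) * ||x||.
x = [(-1.0) ** k / (k + 1) for k in range(N)]
n = 10
diff = [a - b for a, b in zip(project(apply_U(x), n), apply_U(x))]
assert norm(diff) <= gap_norm(n) * norm(x) + 1e-12
```

The norms in `gaps` tend to zero because the diagonal entries do; for a diagonal transformation whose entries stay bounded away from zero (the identity, for instance) the same truncations converge pointwise but not in norm.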

7. On the extension of linear transformations. If U is a linear transformation
on X to Y and Z contains X as a proper subspace, then a linear transformation
$U_1$ on Z to Y such that $U(x) = U_1(x)$ for all $x \in X$ is called an extension
of U. Any Banach space Y can be imbedded(11) in a space of type $M_T$(12).
We will designate such a space which contains Y or its image under an equivalence
by $M_T \supset Y$.


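7.1. THEOREM. The general form of the linear transformation U on X to $M_T$
is

$$U(x) = [\bar{x}_t(x)]$$

where $\|U\| = \text{L.U.B.}\,[\,\|\bar{x}_t\| \mid t \in T\,]$.


For every $t \in T$ there exists a linear functional $\bar{a}_t$ such that $\bar{a}_t(a) = a(t)$.
Let $\bar{x}_t = \bar{U}(\bar{a}_t)$. Then $a(t) = \bar{a}_t[U(x)] = \bar{x}_t(x)$ and

$$\|U\| = \text{L.U.B.}\,[\,|\bar{a}_t[U(x)]| = |\bar{x}_t(x)| \mid t \in T,\ \|x\| \le 1\,] = \text{L.U.B.}\,[\,\|\bar{x}_t\| \mid t \in T\,].$$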


7.2. COROLLARY. Any linear transformation U on X to $M_T$ has an extension
$U_1$ on $Z \supset X$ to $M_T$ such that $\|U\| = \|U_1\|$.


By the Hahn-Banach theorem [1, p. 55, Theorem 2], $\bar{x}_t$ has a norm preserving
extension $\bar{z}_t$ on Z. $U_1(z) = [\bar{z}_t(z)]$ is the required extension.

(11) By an imbedding of Y into a subspace Z of $M_T$ we shall mean that Y is equivalent [1,
p. 180] to Z.

(12) Let $T = \Gamma_1$, the unit sphere of some determining manifold in $\bar{Y}$. Then $U(y) = \bar{y}(y)$ on
Y to $M_T$ defines an equivalence between Y and a subset of $M_T$.


7.3. COROLLARY. If Y is isomorphic with $M_T$, then any linear transformation
U on X to Y has an extension on Z to Y.


As Y and $M_T$ are isomorphic [1, p. 180], there exists a biunique and bicontinuous
linear transformation V on Y into the entire space $M_T$. VU is
then a linear transformation on X to $M_T$ which by Corollary 7.2 has the extension
$(VU)_1$ on Z to $M_T$. It is clear that $V^{-1}(VU)_1$ is the required extension
on Z to Y.


7.4. COROLLARY. Any linear transformation U on X to Y has an extension
on $Z \supset X$ to Y if either of the following is true:

(1) There exists a projection transformation(13) P on Z to X.

(2) There exists a projection transformation P on $M_T \supset Y$ to Y.


If (1) holds then $U_1 = UP$ is the required extension on Z to Y. If (2) holds
and $U_1$ is the extension of Corollary 7.2 on Z to $M_T \supset Y$, then $PU_1$ is the required
extension on Z to Y.

In view of Corollary 7.4, the existence of projection transformations on
spaces $M_T \supset Y$ to Y assumes importance in the study of the extension of
linear transformations. As yet we can give only negative results in this direction.

Fichtenholtz and Kantorovitch [10, p. 92] have proved that there does
not exist a projection transformation on $M_T$ to C where $T = (0, 1)$ and C is
the space of continuous functions on $(0, 1)$. Banach and Mazur [2, p. 111]
have shown that for a separable space Y whose conjugate space is not weakly
complete there does not exist a projection transformation on C to any imbedding
of Y in C. Consequently there exists no projection transformation
on $M_T \supset Y$ to an imbedding of Y which is contained in an imbedding of C
in $M_T \supset Y$.

If there existed a projection transformation on the space $C_1$ of functions
on $(0, 1)$ having only discontinuities of the first kind to C, then the methods
of Gelfand [12, p. 281] would show that the identity transformation on C
to C could be expressed in the form $U(\phi) = \int \phi\,dx$ where $x \in V(X)$. The example
at the end of §4 shows that this is not the case. Therefore there exists
no projection transformation on $M_T \supset C$ to an imbedding of C which is contained
in an imbedding of $C_1$ in $M_T \supset C$.


7.5. There exists no projection transformation on m to c.


If there existed a projection transformation P on m to c, then any weakly
convergent sequence of linear functionals $\{\bar{a}_p\}$ on c would correspond to a sequence
of extensions $\{\bar{x}_p = \bar{P}(\bar{a}_p)\}$ which is weakly convergent on m. Now
$\bar{a}_p(a) = a(p+1) - a(p)$ converges weakly to zero on c. Using the notation of
Corollary 3.4, we have $\bar{x}_p(x) = \int_T x(t)\,d\beta^p$ and $\sum_t |\beta^p(t)| \to 0$. Since $\bar{x}_p(a) = \bar{a}_p(a)
= a(p+1) - a(p)$, it follows that $\beta^p(p+1) = 1 = -\beta^p(p)$, which is contrary to
the above. There can therefore exist no projection transformation on m to c.

(13) A projection transformation P is a linear transformation with the property that $P^2 = P$.


REFERENCES 


1. S. Banach, Théorie des Opérations Linéaires, Warsaw, 1932. 

2. S. Banach and S. Mazur, Zur Theorie der linearen Dimension, Studia Mathematica, 
vol. 4 (1933), pp. 100-112. 

3. Garrett Birkhoff, Integration in a Banach space, these Transactions, vol. 38 (1935), pp. 
357-378. 

4. S. Bochner, Additive set functions on groups, Annals of Mathematics, (2), vol. 40 (1939), 
pp. 769-799. 

5. S. Bochner and A. E. Taylor, Linear functionals on certain spaces of abstractly-valued 
functions, Annals of Mathematics, (2), vol. 39 (1938), pp. 913-944. 

6. N. Dunford, Integration and linear operations, these Transactions, vol. 40 (1936), pp. 474-494.

7. N. Dunford, Uniformity in linear spaces, these Transactions, vol. 44 (1938), pp. 305-356.

8. N. Dunford, A mean ergodic theorem, Duke Mathematical Journal, vol. 5 (1939), pp. 635-646.

9. N. Dunford and B. J. Pettis, Linear operations among summable functions, Proceedings 
of the National Academy of Sciences, vol. 25 (1939), pp. 544-550. 

10. G. Fichtenholtz and L. Kantorovitch, Sur les opérations linéaires dans l’espace des fonctions bornées, Studia Mathematica, vol. 5 (1934), pp. 69-98.

11. M. Fréchet, Les ensembles abstraits et le calcul fonctionnel, Rendiconti del Circolo Matematico di Palermo, vol. 30 (1910), p. 19.

12. I. Gelfand, Abstrakte Funktionen und lineare Operatoren, Recueil Mathématique, vol. 4 
(1938), pp. 235-284. 

13. M. Gowurin, Stieltjessche integration, Fundamenta Mathematicae, vol. 27 (1936), pp. 
254-268. 

14. T. H. Hildebrandt, On bounded linear functional operations, these Transactions, vol. 36 
(1934), pp. 868-875. 

15. T. H. Hildebrandt and I. J. Schoenberg, On linear functional operations and the moment 
problem for a finite interval in one or several dimensions, Annals of Mathematics, (2), vol. 34 
(1933), pp. 317-328. 

16. Shizuo Kakutani, Iteration of linear operations in complex Banach spaces, Proceedings 
of the Imperial Academy of Tokyo, vol. 14 (1938), pp. 295-300. 

17. L. Kantorovitch and B. Vulich, Sur la représentation des opérations linéaires, Compositio Mathematica, vol. 5 (1937), pp. 119-165.

18. A. Kolmogoroff, Über Kompaktheit der Funktionenmengen bei der Konvergenz im Mittel, Nachrichten der Gesellschaft der Wissenschaften zu Göttingen, 1931, pp. 60-63.

19. I. Maddaus, On completely continuous linear transformations, Bulletin of the American 
Mathematical Society, vol. 44 (1938), pp. 279-282. 

20. E. H. Moore and H. L. Smith, A general theory of limits, American Journal of Mathe- 
matics, vol. 44 (1922), pp. 102-121. 

21. W. Orlicz, Beiträge zur Theorie der Orthogonalentwicklungen II, Studia Mathematica, vol. 1 (1929), pp. 241-255.

22. B. J. Pettis, On integration in vector spaces, these Transactions, vol. 44 (1938), pp. 277- 
304. 




23. R. S. Phillips, On integration in a linear convex topological space, these Transactions, 
vol. 47 (1940), pp. 114-146. 

24. J. Radon, Theorie und Anwendung der absolut additiven Mengenfunktionen, Sitzungs- 
berichte der Akademie der Wissenschaften, Vienna, Class IIa, vol. 122 (1913), p. 1384. 

25. M. Riesz, Sur les ensembles compacts de fonctions sommables, Acta Szeged, vol. 6 (1933), 
pp. 136-142. 

26. J. Tamarkin, On the compactness of the space $L_p$, Bulletin of the American Mathematical Society, vol. 38 (1932), pp. 79-84.

27. A. Tulajkov, Zur Kompaktheit im Raum $L_p$ für $p=1$, Nachrichten der Gesellschaft der Wissenschaften zu Göttingen, 1933, pp. 167-170.

28. Kôsaku Yosida, Mean ergodic theorem in Banach spaces, Proceedings of the Imperial Academy of Tokyo, vol. 14 (1938), pp. 292-294.


THE INSTITUTE FOR ADVANCED STUDY, 
PRINCETON, N. J. 


ON A TYPE OF ALGEBRAIC DIFFERENTIAL MANIFOLD 


BY 
J. F. RITT 


The manifolds(1) to be investigated, which are manifolds of systems of
differential polynomials in a single unknown, possess a degree of analogy to 
bounded sets of numbers. They are manifolds which may be said “not to con- 
tain infinity as a solution”; more definitely, zero is not a limit of reciprocals 
of solutions. 

For manifolds of this type, which will be called limited, operations of addition, multiplication and differentiation will be studied. Given two manifolds(2) $\mathfrak{M}_1$ and $\mathfrak{M}_2$, their arithmetic sum is secured by completing into a manifold the totality of functions each of which is, in some area, the sum of a solution in $\mathfrak{M}_1$ and a solution in $\mathfrak{M}_2$. Multiplication is defined similarly.

It turns out that if $\mathfrak{M}_1$ and $\mathfrak{M}_2$ are general solutions of equations of the first order, and are limited, their sum and product are limited. On the other hand, as is shown by examples based on the theory of the elliptic functions, when $\mathfrak{M}_1$ and $\mathfrak{M}_2$ involve more than one arbitrary constant their limited character may not be communicated to their sum and product; what is equivalent to this, as far as multiplication is concerned, is the rather unexpected result that the product of two manifolds may contain zero even if neither manifold does.

The derivative of a limited manifold proves to be limited in all cases. 


LIMITED MANIFOLDS 


1. Let $\Sigma$ be a system of forms in the single unknown y. Let us suppose that $\Sigma$ has solutions and that it has at least one solution which is not identically zero. The transformation $z = 1/y$ carries every nonzero solution of $\Sigma$ into a definite function z. There exist forms in z which vanish for every function z thus obtained. Let $\Sigma'$ be the totality of such forms in z. It is not difficult to see that the manifold of $\Sigma'$ is the set of the reciprocals of the nonzero solutions of $\Sigma$, enlarged perhaps by the adjunction of $z=0$.

If $\Sigma'$ does not admit $z=0$ as a solution, we shall call the manifold of the original system $\Sigma$ limited(3).

2. If $\Sigma'$ has $z=0$ as a solution, $z=0$ cannot be an essential manifold for $\Sigma'$. If it were, $\Sigma'$, which is closed, would contain a form $zA$ where A does not vanish for $z=0$. Now A would vanish for the reciprocal of every nonzero solution of the system $\Sigma$. It would thus be in $\Sigma'$ and would rule out the solution $z=0$.

Presented to the Society, September 12, 1940; received by the editors March 20, 1940.

(1) For indications in regard to the general theory to which this paper attaches, one may consult the author’s paper in the second volume of the Semicentennial Publications of the American Mathematical Society.

(2) Not necessarily limited.

(3) If $\Sigma$ admits only $y=0$ as a solution, its manifold will also be called limited.

Thus, if the manifold of $\Sigma$ is not limited, there is a dense set of values of x such that, given any point a of the set, any positive integer m and any $\epsilon>0$, we can find a solution of $\Sigma$ whose reciprocal is analytic at a and has a Taylor expansion at a in which the first $m+1$ coefficients have moduli less than $\epsilon$. When the manifold of $\Sigma$ is limited, no point exists which has the property, just stated, of the points a.

3. Let $\Sigma$ be a closed system of forms in y which admits solutions. We shall prove that for the manifold of $\Sigma$ to be limited, it is necessary and sufficient that $\Sigma$ contain a form A which, considered as a polynomial in y and its derivatives, possesses a term in y alone, that is, a term free of the $y_i$ with $i>0$, which is of higher degree than every other term in A.

Let the manifold be limited. We may suppose that there are solutions other than $y=0$. Then $\Sigma'$, as above, contains a form $1+K$ with K a nonzero form which vanishes for $z=0$. Making the substitution $z = 1/y$ in K, and clearing fractions, we obtain a form in $\Sigma$ answering to the description of A.

Conversely, let $\Sigma$ contain a form A as described. If we put $y=1/z$ in A and clear fractions, we secure a form B in $\Sigma'$, one of whose terms, free of proper derivatives of z, is of lower degree than every other term. Thus, if $z=0$ were in the manifold of $\Sigma'$, it would be an essential manifold. This, by §2, is impossible.
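
The criterion of §3 can be tried on two small first-order forms. The forms $y_1 - y^2$ and $y_1 - 1$ used below are illustrative choices made here and do not appear in the original; the computation is a sketch of the substitution $y = 1/z$ only.

```latex
% Illustrative forms, not taken from the text.
% (a) A = y_1 - y^2: the term -y^2, in y alone, has degree 2,
%     higher than the degree 1 of the only other term y_1.
\begin{align*}
  y = \frac{1}{z} &\;\Longrightarrow\; y_1 = -\frac{z_1}{z^2}, \\
  y_1 - y^2 = 0 &\;\Longrightarrow\; -\frac{z_1}{z^2} - \frac{1}{z^2} = 0
      \;\Longrightarrow\; z_1 + 1 = 0 .
\end{align*}
% z = 0 does not annul z_1 + 1, so the manifold of y_1 - y^2 is limited.
%
% (b) y_1 - 1: the solutions are y = x + c, with reciprocals z = 1/(x + c).
\begin{align*}
  z_1 = -\frac{1}{(x+c)^2} = -z^2
  \;\Longrightarrow\; z_1 + z^2 = 0 .
\end{align*}
% z = 0 annuls z_1 + z^2 and is the limit of 1/(x + c) as c grows,
% so the manifold of y_1 - 1 is not limited.
```

In case (a) the form $z_1 + 1$ has its term free of proper derivatives of z, the constant 1, of lower degree than the other term, which is the situation described in the converse argument.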

CONSIDERATIONS OF GENERAL THEORY 


4. We present here a theorem of a general character which will be em- 
ployed in §10. 

Let $\Sigma$ be a nontrivial closed irreducible system in the unknowns $u_1, \dots, u_q$; $y_1, \dots, y_p$ with the $u_i$ (which may be nonexistent) arbitrary and with $p>1$. Let m be any positive integer not greater than p. Those forms in $\Sigma$ which involve only the $u_i$ and $y_1, \dots, y_m$ constitute a closed irreducible system $\Sigma_m$ in the unknowns just mentioned. For $m = p$, $\Sigma_m$ is $\Sigma$.

Let $m<p$. Given a solution

(1) $\bar{u}_1, \dots, \bar{u}_q;\ \bar{y}_1, \dots, \bar{y}_m$

of $\Sigma_m$, analytic in an area $\mathfrak{A}_1$, there may exist an area $\mathfrak{A}_2$ contained in $\mathfrak{A}_1$ and a set of functions

(2) $\bar{y}_{m+1}, \dots, \bar{y}_p$

analytic in $\mathfrak{A}_2$, such that (1) and (2) constitute a solution of $\Sigma$ in $\mathfrak{A}_2$. In that case, we shall say that the solution (1) of $\Sigma_m$ can be completed into a solution of $\Sigma$.


We are going to prove that there exists a form G in $u_1, \dots, u_q$; $y_1, \dots, y_m$ which does not belong to $\Sigma_m$ and which has the property that every solution of $\Sigma_m$ which does not annul G can be completed into a solution of $\Sigma$.
5. Let

(3) $A_1, \dots, A_p$

be a basic set of $\Sigma$, $A_i$ introducing $y_i$. Let the order of $A_i$ in $y_i$ be $r_i$. Let $S_i$ and $I_i$ be respectively the separant and initial of $A_i$.
We consider the system of forms

(4) $A_1, \dots, A_{m+1}$

which is a basic set of $\Sigma_{m+1}$. Let a form L be given which is not in $\Sigma_{m+1}$ and which is such that every $y_{ij}$ appearing in L has $i \le m+1$ and $j \le r_i$. We place no restrictions on the $u_{ij}$ in L. We shall establish a relation

(5) $R = M + NLS_{m+1}$

of the following description. The $y_{ij}$ in R, M and N have $i \le m+1$ and $j \le r_i$. M is contained in $\Sigma_{m+1}$. R, distinct from zero, is free of the $y_{m+1,j}$. Thus R is not in $\Sigma_{m+1}$.

6. Let (4) be considered as a set of simple forms. Then (4) will be a basic set of a prime system(4) $\Pi$. Now $LS_{m+1}$ (simple form) is not in $\Pi$. Then every indecomposable system held by $\Pi + LS_{m+1}$ has fewer unconditioned unknowns than $\Pi$. There exists thus a relation (5) with all forms simple forms, R being distinct from zero and free of the $y_{m+1,j}$, and M belonging to $\Pi$. It remains only to consider the forms in (5) as differential polynomials.

7. Let

$J = RI_1 \cdots I_{m+1}.$


Let J be considered as a polynomial in the $y_{m+1,j}$, with coefficients which are forms in the $u_i$ and $y_1, \dots, y_m$. Not all of these coefficients can be in $\Sigma_m$. If they were, J would be in $\Sigma_{m+1}$. Let H be a coefficient which is not in $\Sigma_m$.

We say that any solution $\bar{u}_i$; $\bar{y}_1, \dots, \bar{y}_m$ of $\Sigma_m$ which does not annul H can be completed into a solution of $\Sigma_{m+1}$ which does not annul L.

Let a be a value of x at which all functions of x which we shall use are analytic and at which the above solution of $\Sigma_m$ does not annul H. Let the solution be substituted into J and let numerical values then be attributed to the $y_{m+1,j}$ with $j < r_{m+1}$ in such a way as to give J a numerical value, for $x=a$, which is not zero. We can then find a numerical value for the $r_{m+1}$th derivative of $y_{m+1}$ which, together with $x=a$, etc., annuls $A_{m+1}$. Referring to M in (5), we see that, because the remainder of M with respect to (4) is zero and $I_1, \dots, I_{m+1}$ do not vanish for the indicated numerical values, the values cause M to vanish. Hence $LS_{m+1}$ does not vanish for the values. This means that the above solution of $\Sigma_m$ can be completed into a regular solution of (4) which does not annul L, so that our statement is proved. We note that the $y_{ij}$ in H have $i \le m$ and $j \le r_i$.

(4) The $u_{ij}$ in $\Pi$ are those appearing in (4) and in L.

8. We might have taken $L=1$ in §7. On this basis, let K, a form not in $\Sigma_{m+1}$ which involves only such $y_{ij}$ as appear in (4), be such that every solution of $\Sigma_{m+1}$ which does not annul K can be completed into a solution of $\Sigma_{m+2}$. If, returning to $\Sigma_m$, we take $L=K$, we find an H such that the solutions of $\Sigma_m$ which do not annul H can be completed into solutions of $\Sigma_{m+2}$. The proof of the theorem stated in §4 is thus easy to conclude.


SUMS, PRODUCTS AND DERIVATIVES


9. Let $\Sigma_1$ and $\Sigma_2$ be systems of forms in y, each system possessing solutions. It is possible to form, in various ways, sums $y'+y''$ where $y'$ and $y''$, solutions respectively of $\Sigma_1$ and of $\Sigma_2$, have the same area of analyticity. The manifold of the system of those forms in y which vanish for all sums $y'+y''$ will be called the arithmetic sum of the manifolds of $\Sigma_1$ and $\Sigma_2$. We define similarly arithmetic product, using all products $y'y''$.

Let $\Sigma$ be a system of forms in y which possesses solutions. There are forms in y which vanish if y is the derivative of any solution of $\Sigma$. The manifold of the totality of such forms will be called the derivative of the manifold of $\Sigma$.

Examples. If $\Sigma_1$ and $\Sigma_2$ are the forms $y_1-1$ and $xy_1-y$ respectively, the arithmetic sum of their manifolds is the two-parameter family of functions $y=ax+b$. The arithmetic product is the family $ax^2+bx$. The derivative of the manifold of $\Sigma_2$ is the manifold of $y_1$.
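
The three statements in the Examples can be checked mechanically on sample members of the two families. The following is a minimal sketch, assuming the polynomial solutions $x+c$ of $y_1-1$ and $kx$ of $xy_1-y$. The helper names (`add`, `mul`, `deriv`, `check`) and the test form $x^2y_2-2xy_1+2y$, which vanishes on every $ax^2+bx$, are illustrative assumptions and not part of the original text.

```python
from fractions import Fraction

# A polynomial in x is a list of coefficients, lowest degree first.

def add(p, q):
    """Sum of two polynomials."""
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def mul(p, q):
    """Product of two polynomials."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def deriv(p):
    """Derivative with respect to x."""
    return [i * a for i, a in enumerate(p)][1:] or [Fraction(0)]

def is_zero(p):
    return all(a == 0 for a in p)

X = [Fraction(0), Fraction(1)]  # the polynomial x

def check(c, k):
    """Test the sum, product and derivative claims for y' = x + c, y'' = k x."""
    c, k = Fraction(c), Fraction(k)
    y_a = [c, Fraction(1)]      # x + c, a solution of y_1 - 1
    y_b = [Fraction(0), k]      # k x, a solution of x y_1 - y
    s = add(y_a, y_b)
    p = mul(y_a, y_b)
    # A sum a x + b has vanishing second derivative.
    sum_ok = is_zero(deriv(deriv(s)))
    # Hypothetical test form x^2 y_2 - 2 x y_1 + 2 y for the family a x^2 + b x.
    form = add(
        add(mul(mul(X, X), deriv(deriv(p))),
            mul([Fraction(-2)], mul(X, deriv(p)))),
        mul([Fraction(2)], p),
    )
    prod_ok = is_zero(form)
    # The derivative of k x is the constant k, which annuls the form y_1.
    der_ok = is_zero(deriv(deriv(y_b)))
    return sum_ok, prod_ok, der_ok
```

For instance, `check(3, 5)` examines the sum $6x+3$ and the product $5x^2+15x$. The check confirms membership of sample sums and products in the stated families; it does not by itself show that the families are the full arithmetic sum and product.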

10. Certain solutions in the sum of two manifolds may not be sums $y'+y''$. Such special solutions will now be examined.

Let $\Sigma_1$ be a nontrivial closed system of forms in the unknown u. Let $\Sigma_2$ be a similar system in v. Let $\Lambda$ be a system in u, v, y consisting of the forms in $\Sigma_1$, those in $\Sigma_2$ and $y-(u+v)$. Let $\Omega$ be the totality of forms in u, v, y which hold $\Lambda$. One can prove that $\Omega$ contains nonzero forms in y alone. Let $\Sigma'$ be the totality of forms in $\Omega$ which are free of u and v. If $\Omega$ is the intersection of closed irreducible systems $\Omega_1, \dots, \Omega_s$, then $\Sigma'$ will be the intersection of those subsystems of the $\Omega_i$ which are free of u and v.

We refer now to §4. We see that there is a nonzero form G in y alone, holding no essential irreducible manifold in the manifold of $\Sigma'$, which is such that every solution of $\Sigma'$ which does not annul G can be represented, in some area, as the sum of a solution of $\Sigma_1$ and a solution of $\Sigma_2$.

Let us apply these conclusions to the systems $\Sigma_1$ and $\Sigma_2$ of §9, which we shall suppose closed and nontrivial, with the respective manifolds $\mathfrak{M}_1$ and $\mathfrak{M}_2$ of sum $\mathfrak{M}$. Let G be a form in y, holding no essential irreducible manifold in $\mathfrak{M}$, which is such that every solution in $\mathfrak{M}$ which does not annul G is the sum of solutions taken from $\mathfrak{M}_1$ and $\mathfrak{M}_2$.


Let $\bar{y}$ be any solution in $\mathfrak{M}$. Let $\mathfrak{A}'$ be any area in which $\bar{y}$ is analytic. Let m be a positive integer and $\epsilon$ a positive number. Let $\tilde{y}$ be a solution in $\mathfrak{M}$, analytic in an area $\mathfrak{A}_1$ contained in $\mathfrak{A}'$, $\tilde{y}$ being so taken that G is not annulled by $\tilde{y}$ at any point of $\mathfrak{A}_1$ and that $\tilde{y}-\bar{y}$ has at each point of $\mathfrak{A}_1$ a Taylor expansion in which the first $m+1$ coefficients are of moduli less than $\epsilon$. The existence of $\tilde{y}$ is obvious. Let $\mathfrak{A}_1'$ be an area contained in $\mathfrak{A}_1$ in which $\tilde{y}$ is the sum of solutions taken from $\mathfrak{M}_1$ and $\mathfrak{M}_2$. We now find a second $\tilde{y}$, using an area $\mathfrak{A}_2$ in $\mathfrak{A}_1'$, a larger m and a smaller $\epsilon$. Continuing, we see that there exists a set of points, dense in the area in which $\bar{y}$ is analytic, such that, given any point a of the set, any positive integer m and any $\epsilon>0$, there is a solution $\tilde{y}$ in $\mathfrak{M}$ which, for the neighborhood of a, is the sum of solutions taken from $\mathfrak{M}_1$ and $\mathfrak{M}_2$, the first $m+1$ coefficients in the expansion of $\tilde{y}-\bar{y}$ at a being of moduli less than $\epsilon$.

A similar result holds for the product of $\mathfrak{M}_1$ and $\mathfrak{M}_2$.


DESCRIPTION OF RESULTS OF PAINLEVÉ


11. In §12 we shall employ results of Painlevé concerning the algebroid character of the solution of an algebraic differential equation of the first order(5). While these results have received enough attention to warrant describing them as classic, they have not thus far, to our knowledge, been given didactic exposition. Here, we shall limit ourselves to formulating Painlevé’s results in a manner which will permit us to employ them with precision(6).

Let A be an algebraically irreducible form in y of the first order, of degree m in $y_1$. Let $\mathfrak{A}$ be the area in which the coefficients in A are meromorphic. There figures, in the statement of Painlevé’s results, a set of points E, contained in $\mathfrak{A}$, which includes the poles of the coefficients in A and has no limit point in the interior of $\mathfrak{A}$. When E is removed from $\mathfrak{A}$, there remains an open region $\mathfrak{A}'$.

Let $x_0$ be any point of $\mathfrak{A}'$. Let b be any finite number. Then, given any number $y_0$, close to b and distinct from b, A has precisely m distinct solutions analytic at $x_0$ and assuming the value $y_0$ at $x_0$.

There exist, furthermore, a certain number j (depending on $x_0$ and b) of equations

(6) $y^{p_i} + \alpha_{1i}(x, y_0)\, y^{p_i-1} + \cdots + \alpha_{p_i i}(x, y_0) = 0, \qquad i = 1, \dots, j,$

whose descriptions and roles are as follows. The $\alpha_{ki}$ are functions of x and $y_0$, analytic for $|x-x_0| < \delta$, $|y_0-b| < \delta$, where $\delta$ is some positive number depending on $x_0$ and b. For $y_0$ close to b and distinct from b, each of the m solutions of A mentioned above satisfies one of the equations (6) in the neighborhood of $x = x_0$. Furthermore, every solution in the general solution(7) of A which is analytic at $x_0$ and assumes the value b at $x_0$ satisfies one of the equations (6). Again, if $y(x)$ is a function analytic in an area contained in $|x-x_0| < \delta$ and if $y(x)$ satisfies one of the equations (6) with $y_0$ fixed at a value interior to a circle of center b and radius $\delta$, then $y(x)$ is a solution in the general solution of A. For a given $y_0$ close to b, (6) may yield, in addition to solutions of A which equal $y_0$ at $x_0$, other solutions of A analytic at $x_0$.

(5) Painlevé, Leçons sur la Théorie des Équations Différentielles Professées à Stockholm, Paris, 1897, pp. 70-76.

(6) The matter is not very difficult to work out, starting with the indications given by Painlevé. It is helpful to read Schlesinger, Gewöhnliche Differentialgleichungen, Chapter 3, where somewhat related questions are considered. The Weierstrass preparation theorem can be employed to advantage.

We now deal with solutions of A which assume large values at $x_0$. There exists a $g>0$ such that, for $|y_0| > g$, A has precisely m distinct solutions, analytic at $x_0$, which assume the value $y_0$ at $x_0$. There exists a number h (independent of $x_0$) of equations

(7) $z^{q_i} + \delta_{1i}(x, z_0)\, z^{q_i-1} + \cdots + \delta_{q_i i}(x, z_0) = 0, \qquad i = 1, \dots, h,$

with $\delta_{ki}$ which are analytic for $x = x_0$, $z_0 = 0$. Given any solution y of A, analytic at $x_0$ and assuming there a large value $y_0$, the function $z=1/y$ satisfies one of the equations (7) with $z_0 = 1/y_0$. Given any function z distinct from zero, obtained from the equations (7) for a small value of $z_0$, the reciprocal of z is a solution of A.

We proceed to apply these results of Painlevé. 


LIMITED SUMS AND PRODUCTS 


12. We prove the following theorem. 


THEOREM I. Let $\mathfrak{M}_1$ and $\mathfrak{M}_2$ be general solutions of forms of the first order in y. Let $\mathfrak{M}_1$ and $\mathfrak{M}_2$ be limited. Then the sum and the product of $\mathfrak{M}_1$ and $\mathfrak{M}_2$ are limited manifolds.


We take first the case of the product, disposing of that case by establishing the following result(8).


THEOREM II. Let $\mathfrak{M}_1$ and $\mathfrak{M}_2$ be general solutions of forms of the first order. Let neither $\mathfrak{M}_1$ nor $\mathfrak{M}_2$ have zero among its solutions. Then zero is not a solution in the product of $\mathfrak{M}_1$ and $\mathfrak{M}_2$.


Let us assume that zero is in the product. There are values of x at which zero can be approximated, as in §10, by products of solutions in $\mathfrak{M}_1$ and $\mathfrak{M}_2$. We select a value $x_0$ of this type which does not belong to either of the sets E of §11 associated with the forms whose general solutions are $\mathfrak{M}_1$ and $\mathfrak{M}_2$.

(7) The notion of general solution, as employed here, does not, of course, appear in Painlevé’s work.

(8) What is involved here is the following. Let $\mathfrak{M}$ be the product of the limited $\mathfrak{M}_1$ and $\mathfrak{M}_2$ of Theorem I. By §1, the reciprocals of the solutions distinct from zero in $\mathfrak{M}_1$ and $\mathfrak{M}_2$ are manifolds. We represent the manifolds of reciprocals, which are seen without trouble to be general solutions of forms of the first order, by $\mathfrak{M}_1'$ and $\mathfrak{M}_2'$ and their product by $\mathfrak{M}'$. A form F holds $\mathfrak{M}'$ if it vanishes for every $1/(y'y'')$ with $y'$ in $\mathfrak{M}_1$ and $y''$ in $\mathfrak{M}_2$. By §10, F will vanish for the reciprocal of every nonzero solution in $\mathfrak{M}$. Thus, if $\mathfrak{M}$ is not limited, $\mathfrak{M}'$ contains zero.

For convenience, we use y to designate solutions in $\mathfrak{M}_1$ and u, similarly, for $\mathfrak{M}_2$. Let there be given a sequence of yu whose expansions tend toward zero at $x_0$. From this sequence we can select a subsequence in which the values $y(x_0)$, $u(x_0)$ tend toward definite limits, finite or infinite. We may thus, and shall, assume that such limits exist for the given sequence. We assume, as we may, that the limit of the $y(x_0)$ is zero. The limit of the $u(x_0)$ will be a quantity c, finite or infinite.

We treat first the case in which c is finite. 

We may suppose that all of the y satisfy a single equation (6). We write this equation here in the form

(8) $y^m + \alpha_1(x, y_0)\, y^{m-1} + \cdots + \alpha_m(x, y_0) = 0.$

Similarly, the u may be supposed to satisfy an equation

(9) $u^n + \beta_1(x, u_0)\, u^{n-1} + \cdots + \beta_n(x, u_0) = 0.$

Because zero is not a solution in $\mathfrak{M}_1$, $\alpha_m$ cannot vanish identically in x for a small value of $y_0$; similarly, $\beta_n$ cannot vanish in x for a value of $u_0$ close to c. The theory of symmetric functions shows that the yu satisfy an equation

(10) $(yu)^{mn} + \gamma_1 (yu)^{mn-1} + \cdots + \gamma_{mn} = 0$

where the $\gamma$ are polynomials in the $\alpha$ and the $\beta$, with $\gamma_{mn} = \pm\alpha_m^n \beta_n^m$. Because the Taylor expansions of the yu approach zero, (10) must be satisfied, for $y_0 = 0$, $u_0 = c$, by $yu = 0$. This is not so. We have thus disposed of the case of c finite.
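
The statement about the last coefficient of the equation for the products yu can be tested on numerical roots. The sketch below assumes two monic polynomials given by their roots, standing in for the equations in y (degree m, last coefficient $\alpha_m$) and in u (degree n, last coefficient $\beta_n$) at a fixed x. The function names `poly_from_roots` and `constant_term_check` are hypothetical, and the sign $(-1)^{mn}$ is made explicit here as an assumption about normalization.

```python
from fractions import Fraction
from itertools import product

def poly_from_roots(roots):
    """Coefficients of prod (t - r) over the given roots, highest degree first."""
    coeffs = [Fraction(1)]
    for r in roots:
        r = Fraction(r)
        shifted = coeffs + [Fraction(0)]                    # coeffs times t
        scaled = [Fraction(0)] + [-r * c for c in coeffs]   # coeffs times (-r)
        coeffs = [a + b for a, b in zip(shifted, scaled)]
    return coeffs

def constant_term_check(y_roots, u_roots):
    """Return (gamma_mn, (-1)^(mn) alpha_m^n beta_n^m) for the product equation."""
    m, n = len(y_roots), len(u_roots)
    alpha_m = poly_from_roots(y_roots)[-1]   # last coefficient of the y equation
    beta_n = poly_from_roots(u_roots)[-1]    # last coefficient of the u equation
    # The mn products y*u are the roots of the monic product equation.
    pair_products = [Fraction(a) * Fraction(b) for a, b in product(y_roots, u_roots)]
    gamma_mn = poly_from_roots(pair_products)[-1]
    return gamma_mn, (-1) ** (m * n) * alpha_m ** n * beta_n ** m
```

With roots 1, 2 for y and 3, 5 for u, both entries of the returned pair equal 900. Since the last coefficient is a product of powers of $\alpha_m$ and $\beta_n$, it can vanish only if one of these vanishes, which is the point used in the argument.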

Now, suppose that $c = \infty$. We let z represent the reciprocals of the u. We may assume that the z all satisfy an equation

(11) $z^p + \delta_1(x, z_0)\, z^{p-1} + \cdots + \delta_p(x, z_0) = 0.$

Then the y/z satisfy an equation

(12) $\phi_0 (y/z)^{mp} + \cdots + \phi_{mp} = 0$

with $\phi$ which are polynomials in the $\alpha$ and $\delta$, and with, in particular,

$\phi_0 = \pm\delta_p^m, \qquad \phi_{mp} = \alpha_m^p.$

We reach the contradiction that (12) is satisfied by $y/z=0$ for $y_0 = z_0 = 0$. This concludes the proof of our statement in regard to products.
Continuing with Theorem I, we consider the limited $\mathfrak{M}_1$ and $\mathfrak{M}_2$, under the assumption that their sum is not limited.

Using y for $\mathfrak{M}_1$ and u for $\mathfrak{M}_2$, we consider an $x_0$, and a sequence of $y+u$ for which the expansions of the $1/(y+u)$ tend toward the expansion of zero at $x_0$(9). We shall assume, furthermore, that the sequences of values $y(x_0)$, $u(x_0)$ tend toward definite limits, finite or infinite. At least one of these limits is infinite. Let this be so for the $u(x_0)$. We suppose first that the $y(x_0)$ have a finite limit.

(9) That such a sequence exists can be shown without difficulty by the method of §10.

We arrange so that the y satisfy an equation (8) and the reciprocals z of the u an equation (11). Let

$w = \dfrac{1}{y+u} = \dfrac{z}{1+yz}.$

We find the w to satisfy an equation $\phi_0 w^{mp} + \cdots + \phi_{mp} = 0$ with $\phi_{mp} = \delta_p^m$. We must thus have $\delta_p(x, 0) = 0$. This produces the contradiction that $\mathfrak{M}_2$ is not limited. The case in which the $y(x_0)$ approach $\infty$ is handled in much the same way.

EQUATIONS OF HIGHER ORDER 


13. We shall show by means of examples suggested by the theory of the elliptic functions that the above results cannot be extended to equations of the second order.

The nonconstant solutions of

(13) $y_1^2 = 4(y^3 - e^3),$

where e is any constant, satisfy the equation

$y_2 - 6y^2 = 0,$

whose manifold $\mathfrak{M}_1$ is, by §3, limited. If, in (13), we replace y by $y+e$, (13) goes over into

(14) $y_1^2 = 4(y^3 + 3ey^2 + 3e^2y)$

which implies, when y is not a constant,

(15) $16y^6 - 8y^3y_1^2 - 8y^4y_2 + y_1^4 - 4yy_1^2y_2 + 4y^2y_2^2 = 0$

with a limited manifold $\mathfrak{M}_2$. For any constant e, arbitrarily large, there are solutions in $\mathfrak{M}_1$ and $\mathfrak{M}_2$ respectively whose difference is e. This is enough to show that the theorem on sums does not hold for equations of the second order.

If in (14), we replace y by $3e^2/y$, (14) remains invariant. Thus, for any e, (15) has two solutions whose product is $3e^2$. The theorem on the product thus does not carry over to the second order.
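
Both claims of §13 about the equation $y_1^2 = 4(y^3 + 3ey^2 + 3e^2y)$ can be checked in exact arithmetic at sample values. The sketch below assumes that the second-order form is the one obtained by eliminating e between this equation and its derivative $y_2 = 6(y+e)^2$, namely $16y^6 - 8y^3y_1^2 - 8y^4y_2 + y_1^4 - 4yy_1^2y_2 + 4y^2y_2^2$; since only $y_1^2$ enters, no square roots are needed. The function names are hypothetical.

```python
from fractions import Fraction

def form15(y, y1_sq, y2):
    """The second-order form, written in terms of y, y_1^2 and y_2."""
    return (16 * y**6 - 8 * y**3 * y1_sq - 8 * y**4 * y2
            + y1_sq**2 - 4 * y * y1_sq * y2 + 4 * y**2 * y2**2)

def point_on_14(y, e):
    """Values (y, y_1^2, y_2) forced by the first-order equation at a given y."""
    y, e = Fraction(y), Fraction(e)
    y1_sq = 4 * (y**3 + 3 * e * y**2 + 3 * e**2 * y)
    y2 = 6 * (y + e) ** 2   # differentiate the equation and divide by 2 y_1
    return y, y1_sq, y2

def reciprocal_image(y, e):
    """Image of (y, y_1^2) under y -> 3 e^2 / y; requires y != 0."""
    y, y1_sq, _ = point_on_14(y, e)
    e = Fraction(e)
    w = 3 * e**2 / y
    # d/dx (3 e^2 / y) = -(3 e^2 / y^2) y_1, so the square transforms as below.
    w1_sq = (3 * e**2 / y**2) ** 2 * y1_sq
    return w, w1_sq
```

For example, `form15(*point_on_14(2, 3))` evaluates to zero, and the pair returned by `reciprocal_image(2, 3)` again satisfies the first-order equation with the same e, the product of the two values of y being $3e^2 = 27$.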

THE DERIVATIVE 

14. We prove the following theorem. 


THEOREM. The derivative of a limited manifold is limited. 




Let $\mathfrak{M}$, limited, be held by a form $F = y^p - G$ with every term in G of degree less than p. We have $y^p \equiv G,\ (F)$. Now $y_1^{2p-1} \equiv 0,\ (y^p)$. Hence there is a relation $y_1^{2p-1} \equiv H,\ (F)$ with every term in H of degree less than $2p-1$.
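
The fact that the power $y_1^{2p-1}$ is a combination of $y^p$ and its derivatives can be seen directly in the smallest nontrivial case. The following worked identity for $p=2$ is an illustration added here, with $(y^2)_1$ and $(y^2)_2$ denoting the first and second derivatives of $y^2$.

```latex
\begin{align*}
  (y^2)_1 &= 2 y y_1, \qquad (y^2)_2 = 2 y_1^2 + 2 y y_2, \\
  y_1 (y^2)_2 - y_2 (y^2)_1
    &= 2 y_1^3 + 2 y y_1 y_2 - 2 y y_1 y_2 = 2 y_1^3 .
\end{align*}
% Hence y_1^3 = (1/2) [ y_1 (y^2)_2 - y_2 (y^2)_1 ],
% which is the case 2p - 1 = 3.
```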

We arrange $y^p - G$ and $y_1^{2p-1} - H$ in powers of y, securing two polynomials in y,

(16) $A_0 y^p + \cdots + A_p$

and

(17) $B_0 y^q + \cdots + B_q.$

Here $A_0=1$ and $A_i$ is of degree less than i for $i>0$. Also, $B_q$ has $y_1^{2p-1}$ as one term and its other terms are of degree less than $2p-1$. Each $B_i$ with $i<q$ is of degree less than $2p-q+i-1$.

We consider the resultant, R, of (16) and (17) with respect to y. One of the terms of R is $A_0^q B_q^p$, that is, $B_q^p$. Now $B_q^p$ contains $y_1^{(2p-1)p}$ and its other terms are of degree less than $(2p-1)p$. Consider any other term in R,

$T = kA_{\mu_1} \cdots A_{\mu_q} B_{\nu_1} \cdots B_{\nu_p}.$

We have

$\mu_1 + \cdots + \mu_q + \nu_1 + \cdots + \nu_p = pq.$

At least one $\mu$ is positive and at least one $\nu$ is less than q. We have thus, for the degree d of T in the $y_i$,

$d < \sum \mu_i + \sum \nu_i + p(2p - q - 1) = (2p-1)p.$

Thus $R = y_1^{(2p-1)p} + K$ where each term in K is of degree less than $(2p-1)p$. We note that the $y_i$ in K have $i>0$. Now R holds $\mathfrak{M}$. Then the derivative of $\mathfrak{M}$ is held by the form obtained from R by replacing each $y_i$ appearing in R by $y_{i-1}$. Thus the derivative of $\mathfrak{M}$ is limited.

15. We shall prove that, if F, in §14, is of the first order, the derivative of $\mathfrak{M}$ is held by a form $y^q + L$ with L of the first order and of degree less than q.

It will suffice to prove that the derivative of the manifold of F is held by a form $y^q + L$ as just described. In that proof, we may and shall assume that F is algebraically irreducible.

We consider F and its derivative $F_1$ as polynomials in y and denote their resultant with respect to y, which is not identically zero, by R. Now R must involve $y_2$ effectively; otherwise R, which holds F, would be divisible by F, which involves y.

We show now that R contains a term in $y_1$ alone which is of higher degree than any other term in R. This will prove our statement.

Let us assume that the terms of highest degree in R involve $y_2$. We consider the equation $R=0$ as an algebraic equation for $y_2$. It will be satisfied, for the neighborhood of $y_1 = \infty$, by some series of descending rational powers of $y_1$,

(18) $y_2 = \alpha y_1^{s/r} + \beta y_1^{(s-1)/r} + \cdots$

with $r>0$, $s \le r$ and $\alpha$, $\beta$, etc., functions of x analytic in some area(10).

We substitute this expression for $y_2$ into $F_1$, whereupon $F_1$ goes over into a polynomial f in y whose coefficients are infinite series in $y_1$. We consider the equations $F=0$ and $f=0$ as algebraic equations for y. They must have a common solution given by a series of descending powers of $y_1$, effectively involving $y_1$,

(19) $y = \zeta y_1^{t/u} + \cdots$

with u a multiple of r, (19) converging for large values of $y_1$. We assume that $\zeta \neq 0$.

Suppose first that $t>0$. Then $t<u$ since, when F is considered as a polynomial in y and $y_1$, its term $y^p$ is of higher degree than every other term. From (19), we find by inversion, for the neighborhood of $y = \infty$, a series of descending powers for $y_1$ of the type

(20) $y_1 = \lambda y^{u/t} + \cdots.$

Substituting $y_1$, as in (20), into (18), we find a series for $y_2$

(21) $y_2 = \rho y^{us/(rt)} + \cdots.$

If we replace $y_1$ and $y_2$ in $F_1$ by their expressions in (20) and (21), $F_1$ will vanish identically in x and y. But if we replace $y_1$ in the equation $F_1=0$ by the second member of (20) and solve the resulting equation for $y_2$, we will find for $y_2$ a series in y obtained by differentiating the second member of (20) and replacing $y_1$ in the result by its expression in (20). The series thus obtained begins effectively with a power of y whose exponent is $(2u/t)-1$, which exceeds the first exponent in (21).

Now suppose that $t=0$. Then (19) yields an expansion for $y_1$ in ascending powers of $y-\zeta$ of the type

(22) $y_1 = \mu (y-\zeta)^{-k} + \cdots$

where k is a positive rational number. Substituting (22) into (18) and proceeding as above, we find again a contradiction of the fact that $s \le r$.

The case of $t<0$ is handled in the same way.

(10) The second member of (18) may be zero.

16. If F is of order $r>1$, we cannot infer that the derivative of $\mathfrak{M}$ is held by a form $y^q + L$ with L of order not higher than r and of degree less than q. Let $\mathfrak{M}$ be the manifold of $y_2 - y^2$. We find that $\mathfrak{M}$ is held by

$A = y_3^2 - 4y_2y_1^2.$

Suppose now that $\mathfrak{M}$ is held by $B = y_1^q + K$ where K, free of y, is of order not higher than 3 and has no term of degree as high as q. Because $y_1^q$ is effectively a term in B and is divisible neither by $y_2$ nor $y_3$, B is not divisible by A. Hence, the resultant R of A and B with respect to $y_3$ is a nonzero polynomial in $y_1$ and $y_2$. Putting $y_2 = y^2$ in R, we find the contradiction that $\mathfrak{M}$ is held by a form of order less than 2.
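
The form A of §16 can be recovered by one differentiation. The short derivation below is an added illustration, assuming that $\mathfrak{M}$ is the manifold of $y_2 - y^2$.

```latex
\begin{align*}
  y_2 = y^2 \;&\Longrightarrow\; y_3 = 2 y y_1, \\
  y_3^2 = 4 y^2 y_1^2 = 4 y_2 y_1^2
    \;&\Longrightarrow\; A = y_3^2 - 4 y_2 y_1^2 = 0 .
\end{align*}
% A is free of y, so it yields a form holding the derivative of the manifold
% after each y_i is replaced by y_{i-1}.
```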


COLUMBIA UNIVERSITY,
NEW YORK, N. Y.


